Summary

About this document and our methodology

This markdown document contains internal analysis of Creative Footprint Project Copenhagen data. PennPraxis combined CFP survey data with spatial and economic data from local and national sources to understand the relationships between the urban environment and its populations, and CFP venues and their characteristics. Copenhagen’s venue database and related urban data can now be compared to that of other CFP cities.

Any of the graphics or tables in this document can be rendered as vector files (pdf, eps) for use in a final report. They can be styled in Adobe Illustrator or other similar software in the graphic production process. The maps in this document are not heavily styled, as it is expected that the vectors will be manipulated into production quality once relevant graphics are selected for publication.

This document has an appendix outlining methodology and source documentation.

Top-Line Analytical Findings

Top takeaways:

  1. Copenhagen has a very high concentration of venues per capita and per area - as high as any CFP city. These venues are spread widely across the city. Every district in the city has venues, which is unique among CFP cities.

  2. The venue sample contains 108 venues, all of which are in the City of Copenhagen.

  3. Panels of local experts gave Copenhagen’s venues very high ratings for programming quality. Venues were rated as being more focused on creativity and promotion of artistic content than in other CFP cities. This is particularly notable with “Legacy” venues - older, larger venues.

  4. Copenhagen’s urban form, spatial economics, and the characteristics of the venues in the city follow similar patterns to other transit-oriented CFP cities, with a few exceptions.

  5. Copenhagen’s venues are far less concentrated in the urban center than in peer cities. There are very distinct clusters across the city. Based on the community’s high rating of late night transportation access and Copenhagen’s reputation, this might be related to high quality urban accessibility. Widespread, accessible, multi-modal transit presents an opportunity to cultivate investments in programming without the relatively high costs usually associated with a venue’s location in a transit accessible space.

  6. Copenhagen’s venue mix is healthy, but lacks spaces in the under-100 m^2 category. It has a high share of multi-use spaces (68%), and there is more Warehouse use than in other cities.

  7. Presently, there is a good opportunity for intervention to preserve affordability: rent increases over the last five years are lower than those observed in peer CFP cities such as Rotterdam. The Kommuneplan 2019 estimates substantial growth in the near future, and important areas such as Norrebro are forecast to experience gentrification. Housing and development plans will need to keep up to maintain supply and affordability, and to direct mainstream cultural offers to appropriate areas.

SAMPLE CHARACTERISTICS

  1. The CFP sample consists of 108 cultural venues.

  2. The scope of the geographic and economic analysis includes the City of Copenhagen’s 10 districts. It does not include other municipalities in the Capital Region.

VENUE CHARACTERISTICS

  1. Program ratings were, on average, extraordinarily high, specifically with regard to Creative Output, Experimentation, and Promotion of artistic content (Section 2.2). Copenhagen’s experimentation and promotion ratings were the highest of any CFP city, and creative output was second only to Sydney. Community focus was second lowest of any city. There were notably few venues that were assessed poorly across all categories. Programming characteristics are generally similar between venue types, including venues that also have restaurant programming. Clubs were assessed slightly higher.

  2. The venue sample has a mix of venues of various sizes, but lacks venues under 100 m^2 - a critical part of a “healthy” venue ladder (Section 1.2.1). The larger the venue, the less likely it is to have a community or experimentally-focused program (Section 2). Only 8% of our venues are in this smallest category, which is too few to support meaningful measurements about programming, but the global trend is that the smallest venues have the highest community programming ratings.

     The sample is dominated by venues between 100 and 500 square meters (~34%), and large venues over 1000 square meters (36%).

  3. Venue age profile is similar to other CFP cities (Section 1.2.4). Legacy venues (over 20 years old) tend to be in the center (Indre By and neighboring districts), but the Copenhagen sample is notably not as center-heavy as other cities.

  4. Multi-use venues are widespread, and there is notable over-representation of Warehouse, Studio, and Open Air uses compared to other CFP cities, and under-representation of restaurant uses. This suggests more of a content focus relative to other cities, where multi-functional restaurants host many music events.

  5. Copenhagen’s venues fall into three general groups - Legacy Venues, Mainstream Venues, and Creative Engines. Copenhagen is unique in that older, larger “Legacy Venues” have very highly rated programming. Creative Engines - smaller, younger, more experimental venues - are (as is typical) located further outside the center. We categorized the venues using a machine learning classification algorithm.
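
The summary does not say which classification algorithm produced the three venue groups. Purely as an illustration, the sketch below swaps in k-means clustering (base R `stats::kmeans`) on standardized venue attributes. The toy data, the column names `size_class`, `age_class`, and `mean_rating`, and the choice of k-means itself are all assumptions, not the method used in this analysis.

```r
# Illustrative only: the report does not name the algorithm that was used.
# k-means is a stand-in here, and every value below is made up.
set.seed(42)

# Hypothetical toy venues: size class (1-4), age class (1-4), mean program rating (1-4)
toy_venues <- data.frame(
  size_class  = c(4, 4, 3, 4, 2, 3, 2, 3, 1, 1, 2, 1),
  age_class   = c(4, 4, 4, 3, 2, 3, 2, 2, 1, 1, 1, 2),
  mean_rating = c(3.8, 3.6, 3.7, 3.5, 2.1, 2.3, 2.0, 2.4, 3.9, 3.7, 3.8, 3.6)
)

# Standardize so that no single variable dominates the distance calculation
scaled <- scale(toy_venues)

# Three clusters, with several random starts for a more stable solution
fit <- kmeans(scaled, centers = 3, nstart = 25)
toy_venues$group <- fit$cluster

# Cluster profiles in the original units; group names would be assigned by hand
aggregate(. ~ group, data = toy_venues, FUN = mean)
```

In a workflow like this, cluster numbers carry no meaning on their own, so labels such as "Legacy" or "Creative Engine" would be attached only after inspecting the cluster profiles.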

VENUES AND THE CITY

  1. GEOGRAPHY OF DISTRICTS Venues in Copenhagen are notably diffuse across the city. There are numerous venue-rich districts and clusters across the city. (Sections ????) Copenhagen has no single district with a density among the top 25 districts in the CFP sample. Indre By and Norrebro, the densest districts, are 1/2 as dense as Rotterdam’s Centrum (2020) or Berlin’s Friedrichshain-Kreuzberg (2017), and 1/4 as dense as New York’s East Village (2018). (Section 6.5)

  2. Copenhagen is among the most venue-dense CFP cities - it has as many venues per square kilometer or per person as any CFP city (Section 6.5). With the exception of Sydney, where the sampling area was only the densest subset of the metro region, Copenhagen has the highest per capita and per area density of venues in the CFP (though lower than Nashville, which was assessed with a very similar methodology).

  3. Copenhagen’s venues are notably closer to one another, on average, than those of any other CFP city. This perhaps reflects the human scale that Copenhagen is known for, in its streetscaping and in its transportation accessibility (Section 6.5). This presents opportunities for nightlife management, which is most efficiently done by focusing on districts for matters of safety, circulation, lighting, sound, etc.

  4. Venue densities, programming ratings and urban variables show correlations similar to those in other CFP cities (Section 6.3). Copenhagen has a relatively “strong center” urban form - with transit and rents concentrated in the center - and venue density corresponds to transit density and higher local rents. Higher investments in programming tend to take place in areas where rents are relatively lower (and access is somewhat poorer). Copenhagen’s compact urban form and mass transit-centered transportation system make it more similar to other European and Asian CFP cities and less similar to North American and Australian CFP cities (Section 6.4.5).

  5. As in many cities, the relatively higher rents of the city center (in this case Indre By) are associated with lower content ratings. The Norrebro district, though relatively central, has among the highest program ratings of any district (Section 5, Section 9). Notably, it lacks any Mainstream venues. Indre By, unlike many downtowns, has a substantial number of well-regarded venues - largely the highly rated Legacy Venues.

  6. Despite the relatively high cost of living in Copenhagen, property values are not rising as fast as in some peer cities recently researched by CFP. Incomes are rising faster than property values - it is unclear what this suggests about changes in the local economy. (Section 6.1)

  7. District Profiles - The following districts are profiled by their economic and cultural characteristics in Section 9:

  • Indre By
  • Norrebro
  • Vesterbro / Kongens Enghave
  • Osterbro
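
The density and proximity findings above rest on two simple measures: venues per unit of area or population, and the average distance from each venue to its nearest neighbor. The sketch below shows both on made-up numbers in base R. The district table, the coordinates, and all column names are hypothetical, and the coordinates are assumed to be in a projected CRS (meters), since raw longitude and latitude would not give distances in meters.

```r
# Hypothetical district table: counts, areas, and populations are made-up numbers
districts <- data.frame(
  d2_name    = c("A", "B", "C"),
  venues     = c(30, 12, 6),
  area_km2   = c(5, 8, 12),
  population = c(50000, 80000, 60000)
)
districts$venues_per_km2 <- districts$venues / districts$area_km2
districts$venues_per_10k <- 10000 * districts$venues / districts$population

# Hypothetical venue coordinates in meters (projected, not lon/lat)
venues_xy <- cbind(x = c(0, 100, 100, 400, 1000),
                   y = c(0,   0, 100, 400, 1000))

d <- as.matrix(dist(venues_xy))            # pairwise Euclidean distances
diag(d) <- NA                              # ignore each venue's distance to itself
nn_dist <- apply(d, 1, min, na.rm = TRUE)  # distance to the nearest other venue
mean_nn <- mean(nn_dist)

districts
mean_nn
```

A full pairwise distance matrix is fine for a few hundred venues. For the pooled multi-city data, a k-nearest-neighbor search such as the one in the `FNN` package loaded in the setup chunk would likely scale better.
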
knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE)
# ---- Load Packages ----
#list of packages


library(tidyverse)
library(data.table)
library(sf)
library(viridis)
library(viridisLite)
#library(translateR)
library(kableExtra)
library(leaflet)
library(leaflet.extras)
library(lubridate)
library(mapview)
library(spdep)
library(FNN)

# ---- Load MUSA 5080 functions -----

# Import MUSA 5080 functions
source("https://raw.githubusercontent.com/urbanSpatial/Public-Policy-Analytics-Landing/master/functions.r")

# ---- Load Graphic Palettes ----

# Load a ggplot theme and a color palette

plotTheme <- theme(
  plot.title =element_text(size=12),
  plot.subtitle = element_text(size=8),
  plot.caption = element_text(size = 6),
  axis.text.x = element_text(size = 10, angle = 45, hjust = 1),
  axis.text.y = element_text(size = 10),
  axis.title.y = element_text(size = 10),
  # Set the entire chart region to blank
  panel.background=element_blank(),
  plot.background=element_blank(),
  #panel.border=element_rect(colour="#F0F0F0"),
  # Format the grid
  panel.grid.major=element_line(colour="#D0D0D0",size=.75),
  axis.ticks=element_blank())

mapTheme <- theme(plot.title =element_text(size=12),
                  plot.subtitle = element_text(size=8),
                  plot.caption = element_text(size = 6),
                  axis.line=element_blank(),
                  axis.text.x=element_blank(),
                  axis.text.y=element_blank(),
                  axis.ticks=element_blank(),
                  axis.title.x=element_blank(),
                  axis.title.y=element_blank(),
                  panel.background=element_blank(),
                  panel.border=element_blank(),
                  panel.grid.major=element_line(colour = 'transparent'),
                  panel.grid.minor=element_blank(),
                  legend.direction = "vertical", 
                  legend.position = "right",
                  plot.margin = margin(1, 1, 1, 1, 'cm'),
                  legend.key.height = unit(1, "cm"), legend.key.width = unit(0.2, "cm"))

palette <- c("#10142A", "#47E9B9", "#F55D60", "#71EA48", "#C148EA", "#EAC148")
viridisPalette <- c("#440154", "#73D055", "#F55D60", "#238A8D", "#FDE725")
CityPalette <- c("#A230C2", "#3AEAB8", "#F9C700", "#0069FC", 
                 "#EF3340", "#bdbdbd", "#bcbddc", "#10142A") # Berlin, NYC, Tokyo, Stockholm, Montreal, Sydney, Rotterdam, Copenhagen
# Read in d1_aggregates from elsewhere
d1_aggregates <- st_read("~/GitHub/CFP/unified__city_data/district_aggregates/d1_aggregates_11_16_22.geojson") %>%
  st_as_sf(crs = 4326)

# Load in target city d2_aggregates... do a quick text parsing to remove unwanted text in d2_name

d2_aggregates <- st_read("~/GitHub/CFP/city_data_engineer/copenhagen_data_aggregation/output/Copenhagen_d2_aggregates.geojson") %>%
  st_as_sf(crs = 4326) %>%
  mutate(d2_name = str_replace(d2_name, "District - ", ""))

# Load the d2_aggregates from elsewhere

# Read in all the individual cities and bind together - go sf to data frame to sf if the geojson has a non-4326 EPSG
# Add any missing column names (e.g. Rotterdam)

d2_aggregates_all <- st_read("~/GitHub/CFP/city_data_engineer/berlin_data_aggregation/output/Berlin_d2_aggregates.geojson") %>%
  st_as_sf(crs = 4326) %>%
  rbind(., st_read("~/GitHub/CFP/city_data_engineer/montreal_data_aggregation/output/Montreal_d2_aggregates.geojson") %>%
  st_as_sf(crs = 4326)) %>%
  rbind(., st_read("~/GitHub/CFP/city_data_engineer/newyork_data_aggregation/output/NewYork_d2_aggregates.geojson") %>%
  st_as_sf(crs = 4326)) %>%
  rbind(., st_read("~/GitHub/CFP/city_data_engineer/sydney_data_aggregation/output/Sydney_d2_aggregates.geojson") %>%
  st_as_sf(crs = 4326)) %>%
  rbind(., st_read("~/GitHub/CFP/city_data_engineer/tokyo_data_aggregation/output/Tokyo_d2_aggregates.geojson") %>%
  st_as_sf(crs = 4326)) %>%
  rbind(., st_read("~/GitHub/CFP/city_data_engineer/stockholm_data_aggregation/output/Stockholm_d2_aggregates.geojson") %>%
  as.data.frame() %>%
    st_as_sf(crs = 3006) %>%
    st_transform(4326)) %>%
  rbind(., st_read("~/GitHub/CFP/city_data_engineer/rotterdam_data_aggregation/output/Rotterdam_d2_aggregates.geojson") %>%
  as.data.frame() %>%
    mutate(pop_men_t1 = NA, pop_women_t1 = NA, pop_men_t2 = NA, pop_women_t2 = NA) %>%
    st_as_sf(crs = 3035) %>%
    st_transform(crs = 4326))
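
The Rotterdam step above pads missing population columns with `NA` by hand before binding. A small helper can generalize that idea. The sketch below is a hypothetical helper shown on plain data frames with made-up values; it is not part of the CFP code base, and behavior on `sf` objects is not tested here.

```r
# Hypothetical helper: pad a data frame with NA columns so it matches a
# reference set of column names before rbind(). Columns that are not in
# required_cols are dropped, so use the full reference list.
add_missing_cols <- function(df, required_cols) {
  missing_cols <- setdiff(required_cols, names(df))
  for (col in missing_cols) {
    df[[col]] <- NA
  }
  # Reorder so that every city has its columns in the same order
  df[, required_cols, drop = FALSE]
}

# Toy stand-ins for two city tables (values are made up)
berlin_toy    <- data.frame(d2_name = "Mitte", pop_men_t1 = 100, pop_women_t1 = 110)
rotterdam_toy <- data.frame(d2_name = "Centrum")

all_cols <- names(berlin_toy)
combined <- rbind(berlin_toy, add_missing_cols(rotterdam_toy, all_cols))
combined
```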


# Read in venues
main_venue_data <- read.csv("~/GitHub/CFP/city_data_engineer/data_venues/output/main_venue_data_2024.csv") %>%
  mutate(city = ifelse(city == "Solna", "Stockholm", city))

# Read in EPSG information

source("~/GitHub/CFP/unified__city_data/epsg_list.R")
# Load a list of special wards, do a quick join to assign english names to the d1

# Perhaps this should get put in the district_aggregation code

# list of the GEOID for the 23 central wards of Tokyo that are included in the analysis.
special_wards <- c("13101",
                   "13102",
                   "13103",
                   "13104",
                   "13105",
                   "13106",
                   "13107",
                   "13108",
                   "13109",
                   "13110",
                   "13111",
                   "13112",
                   "13113",
                   "13114",
                   "13115",
                   "13116",
                   "13117",
                   "13118",
                   "13119",
                   "13120",
                   "13121",
                   "13122",
                   "13123")


d1_aggregates <- left_join(d1_aggregates %>%
                    mutate(d1_name_en = ifelse(city_en != "Tokyo", as.character(d1_name), NA),
                           d1_id = as.character(d1_id)), 
                           as.data.frame(cbind(c("Chiyoda", 
                                    "Chuo",
                                    "Minato",
                                    "Shinjuku",
                                    "Bunkyo",
                                    "Taito",
                                    "Sumida",
                                    "Koto",
                                    "Shinagawa",
                                    "Meguro",
                                    "Ota",
                                    "Setagaya",
                                    "Shibuya",
                                    "Nakano",
                                    "Suginami",
                                    "Toshima",
                                    "Kita",
                                    "Arakawa",
                                    "Itabashi",
                                    "Nerima",
                                    "Adachi",
                                    "Katsu-Shika",
                                    "Edo Gawa"),
                                  special_wards)),
                  by = c("d1_id" = "special_wards")) %>%
                    mutate(d1_name_en = if_else(is.na(d1_name_en) == TRUE, as.character(V1), 
                                                d1_name_en))%>%
                    select(-V1)
# Remove former arrondissements from d1 aggregates that are now towns since 2006 reorganization
# EXCEPT Mont Royal
remove_arrondissements <- c("Baie-d'Urfé",
                           "Beaconsfield",
                           "Senneville",
                           "Kirkland",
                           "Montréal-Est",
                           "Sainte-Anne-de-Bellevue",
                           "Dorval",
                           "Pointe-Claire")

1 Venues

The data sample has 108 venues.

main_venue_data %>% filter (city == "Copenhagen") %>% tally()
##     n
## 1 108

1.1 Interactive Map

This interactive map shows all CFP venues - this is for internal use only.

Hover over any venue with your cursor to see some information about the data point.

Sources: CFP

l <- leaflet() %>% 
  addProviderTiles(providers$Esri.WorldTopoMap) %>%
  setView(lng = mean(main_venue_data$x, na.rm = TRUE),
          lat = mean(main_venue_data$y, na.rm = TRUE),
          zoom = 2) %>%
  addScaleBar(position = "topleft") %>%
  addCircleMarkers(data= main_venue_data %>%
                     mutate(name = iconv(name, to = "UTF-8")),
                   lng=~x, 
                   lat=~y,
                   radius =~ 1, 
                   fillOpacity =~ 1,
                   color =~ "blue",
                   label=~paste(name, " | ", street, " | uid: ", uid, "| Year: ", year ))

l
# Note the conversion above using iconv to get the Japanese characters to UTF-8

1.2 Venue Characteristics

1.2.1 Venue Size Chart

The venue “ladder” is dominated by 100-500 sqm (34%) and large venues (36%) but lacks small (sub 100 square meter) venues (just 8%). Such spaces are important for the development of local artists and community scenes.

The relatively small sample of sub 100 sqm venues means that cross tabulations about the nature of these spaces (programming, uses, etc.) are probably not useful.

Sources: CFP

ggplot(data = main_venue_data %>%
         filter(size != 0,
                is.na(city) == FALSE,
                city == "Copenhagen") %>%
         mutate(size = ifelse(size == 5, 4, size)) %>%
         mutate(size = case_when(size == 1 ~ "1. < 100",
                                 size == 2 ~ "2. 101-500",
                                 size == 3 ~ "3. 501-1000",
                                 size == 4 ~ "4. 1001+")) %>%
         group_by(size) %>%
         tally())+
  geom_bar(aes(y = n, x = size), stat = 'identity', 
           fill = CityPalette[8], alpha = 0.6)+
  #scale_fill_viridis_d()+
  labs(
    title = "Venue Size Distribution - Copenhagen (Square Meters)",
    subtitle = "",
    x="",
    y="Total Venues",
    #fill = "CFP City",
    caption = "Data: CFP")+
  plotTheme
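
To make the small-sample caveat for the smallest venues concrete, the sketch below puts an exact binomial confidence interval around the share of venues under 100 m2, using the 9 of 108 count reported in the Venue Size Table. It is an illustration of the uncertainty only, not part of the original analysis, and the variable names are made up for the example.

```r
n_small <- 9     # venues under 100 m2, from the Venue Size Table
n_total <- 108   # venues in the Copenhagen sample

share <- n_small / n_total

# Exact (Clopper-Pearson) 95% confidence interval for the share
ci <- binom.test(n_small, n_total)$conf.int

round(100 * share, 2)   # share as a percentage
round(100 * ci, 1)      # interval bounds as percentages

# Splitting 9 venues across four rating levels leaves only about 2 per cell
n_small / 4
```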

1.2.2 Venue Size Table

main_venue_data %>%
         filter(size != 0,
                is.na(city) == FALSE,
                city == "Copenhagen") %>%
         mutate(size = ifelse(size == 5, 4, size)) %>%
         mutate(size = case_when(size == 1 ~ "1. < 100 m2",
                                 size == 2 ~ "2. 101-500 m2",
                                 size == 3 ~ "3. 501-1000 m2",
                                 size == 4 ~ "4. 1001+ m2")) %>%
         group_by(size) %>%
         tally() %>%
  mutate(pct = round(100*(n/sum(n)), digits = 2)) %>%
  kable() %>%
  kable_styling() %>%
    scroll_box(width = "650px", height = "400px")
| size | n | pct |
|---|---|---|
| 1. < 100 m2 | 9 | 8.33 |
| 2. 101-500 m2 | 37 | 34.26 |
| 3. 501-1000 m2 | 23 | 21.30 |
| 4. 1001+ m2 | 39 | 36.11 |

1.2.3 Venue Size - Multi-City Comparison

The pattern we are observing is not uncommon - the distribution of sizes is similar to New York and Stockholm.

Sources: CFP

main_venue_data %>%
         filter(size != 0,
                is.na(city) == FALSE) %>%
         mutate(size = as.numeric(size),
                size = ifelse(size == 5, 4, size)) %>%
         mutate(size = case_when(size == 1 ~ "1. < 100",
                                 size == 2 ~ "2. 101-500",
                                 size == 3 ~ "3. 501-1000",
                                 size == 4 ~ "4. 1001+"),
                city  = case_when(city == "Berlin" ~ "1. BERLIN, 2017",
                                  city == "New York" ~ "2. NEW YORK CITY, 2018",
                                  city == "Tokyo" ~ "3. TOKYO, 2019",
                                  city == "Stockholm" ~ "4. STOCKHOLM, 2021",
                                  city == "Montreal" ~ "5. MONTREAL, 2022",
                                  city == "Sydney" ~ "6. SYDNEY, 2023",
                                  city == "Rotterdam" ~ "7. ROTTERDAM, 2024",
                                  city == "Copenhagen" ~ "8. COPENHAGEN, 2025")) %>%
         group_by(size, city) %>%
         tally() %>%
         ungroup() %>%
         group_by(city) %>%
         mutate(pct = 100*n/sum(n)) %>%
  ggplot()+
  geom_bar(aes(x = size, y = pct, fill = city), 
           stat = "identity", position = "dodge",
           alpha = 0.6) +
  scale_fill_manual(values = c(CityPalette[1], CityPalette[2], CityPalette[3], 
                               CityPalette[4], CityPalette[5], CityPalette[6],
                               CityPalette[7], CityPalette[8]))+
  facet_wrap(~city)+
  theme(legend.direction = "horizontal", legend.position = "bottom")+
  labs(
    title = "Venue Size - CFP Cities",
    subtitle = "",
    x="Size (Square Meters)",
    y="Percentage of Venues",
    fill = "CFP City",
    caption = "Data: CFP")+
  plotTheme

#ggsave("rotterdam_images/multi_citysize.pdf", width = 8, height = 11, units = "in")

1.2.3.2 Venue size Dynamic Map

Larger venues are more commonly found in outlying areas.

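mapView(main_venue_data %>% 
          filter(city == "Copenhagen",
                 is.na(x) == FALSE, is.na(y) == FALSE) %>% 
          # Fold size code 5 into 4, as in the size chart and table above,
          # so the largest venues are not left unlabeled (NA) on the map
          mutate(size = ifelse(size == 5, 4, size)) %>%
          mutate(size = case_when(size == 1 ~ "1. < 100 m2",
                                 size == 2 ~ "2. 101-500 m2",
                                 size == 3 ~ "3. 501-1000 m2",
                                 size == 4 ~ "4. 1001+ m2")) %>%
          st_as_sf(coords = c("x", "y"), crs = 4326), zcol = "size" )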

1.2.4.2 Venue Size and Use Type

 main_venue_data %>%
         filter(size != 0,
                is.na(city) == FALSE,
                city == "Copenhagen") %>%
         mutate(size = ifelse(size == 5, 4, size)) %>%
         mutate(size = case_when(size == 1 ~ "1. < 100 m2",
                                 size == 2 ~ "2. 101-500 m2",
                                 size == 3 ~ "3. 501-1000 m2",
                                 size == 4 ~ "4. 1001+ m2")) %>%
                  select(size, venueType_disco, venueType_club, 
                         venueType_concertHall, venueType_musicBar, 
                         venueType_restaurant, venueType_gallery)%>%
  gather(-size, key = "variable", value = "value") %>%
  filter(value != 0, is.na(value) == FALSE) %>%
  group_by(size, variable) %>%
  tally() %>%
ggplot()+ 
  geom_bar(aes(y = n, x = size), stat = "identity", 
           alpha = 0.6, fill = CityPalette[7]) + 
  facet_wrap(~variable, scales = "free")+ 
  labs(
    title = "Venue Size by Use",
    subtitle = "",
    caption = "Data: CFP")+
  ylab("Number of Venues")+
  xlab("Size (Square Meters)")+
  plotTheme

1.2.4 Venue Age Chart

There is a fairly healthy balance of venue ages, similar to Sydney. There are notably fewer 0-3 year old venues than in other CFP cities.

Sources: CFP

ggplot(data = main_venue_data %>%
         filter(is.na(years_operating) == FALSE,
                years_operating != "",
                is.na(city) == FALSE,
                city == "Copenhagen") %>%
         mutate(years_operating = as.numeric(years_operating)) %>%
         mutate(years_operating = case_when(years_operating == 1 ~ "1. 0-3",
                                            years_operating == 2 ~ "2. 4-10",
                                            years_operating == 3 ~ "3. 11-20",
                                            years_operating == 4 ~ "4. 20+")) %>%
         filter(is.na(years_operating) == FALSE) %>%
         group_by(years_operating) %>%
         tally()%>%
         mutate(pct = round(100*(n/sum(n)), digits = 2))%>%
         ungroup())+
  geom_bar(aes(y = n, x = years_operating), stat = 'identity', 
           fill = CityPalette[8], alpha = 0.6)+
  #scale_fill_viridis_d()+
  labs(
    title = "Venue Age (Years)",
    subtitle = "Data only available for x of y venues in sample",
    x="",
    y="Total Venues",
    #fill = "CFP City",
    caption = "Data: CFP")+
  coord_flip()+
  plotTheme
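
The chart subtitle above still carries the placeholder "x of y". One way to fill it from the data rather than by hand is sketched below. The toy data frame stands in for `main_venue_data`, its values are made up, and `age_subtitle` is a hypothetical name.

```r
# Toy stand-in for main_venue_data; values are made up
toy <- data.frame(
  city            = c("Copenhagen", "Copenhagen", "Copenhagen", "Berlin"),
  years_operating = c("1", "", NA, "4")
)

# All venues in the target city, then those with a usable age value
cph <- toy[!is.na(toy$city) & toy$city == "Copenhagen", ]
has_age <- !is.na(cph$years_operating) & cph$years_operating != ""

age_subtitle <- paste0("Data only available for ", sum(has_age),
                       " of ", nrow(cph), " venues in sample")
age_subtitle
```

The resulting string could then be passed as `subtitle = age_subtitle` in `labs()`. Note that the chart also drops values outside the four recognized age codes, so a production version would count rows after that recoding step.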

1.2.5 Venue Age Table

New element here: this table compares Copenhagen’s age profile with the three other post-pandemic CFP cities (Rotterdam, Sydney, and Montreal). It is not clear how well the pre-pandemic data relate; they are in very non-standard formats and the data quality is uncertain. I’m fairly sure there is something wrong with the Tokyo data, specifically.

main_venue_data %>%
         filter(is.na(years_operating) == FALSE,
                years_operating != "",
                is.na(city) == FALSE,
                city == "Copenhagen") %>%
  mutate(years_operating = as.numeric(years_operating)) %>%
         mutate(years_operating = case_when(years_operating == 1 ~ "1. 0-3",
                                            years_operating == 2 ~ "2. 4-10",
                                            years_operating == 3 ~ "3. 11-20",
                                            years_operating == 4 ~ "4. 20+")) %>%
         group_by(years_operating) %>%
         tally()%>%
         mutate(pct = round(100*(n/sum(n)), digits = 2))%>%
         ungroup() %>%
  left_join(main_venue_data %>%
         filter(is.na(years_operating) == FALSE,
                years_operating != "",
                is.na(city) == FALSE,
                city %in% c("Rotterdam", "Sydney", "Montreal")) %>%
         mutate(years_operating = as.numeric(years_operating)) %>%
         mutate(years_operating = case_when(years_operating == 1 ~ "1. 0-3",
                                            years_operating == 2 ~ "2. 4-10",
                                            years_operating == 3 ~ "3. 11-20",
                                            years_operating == 4 ~ "4. 20+")) %>%
         filter(is.na(years_operating) == FALSE) %>%
         group_by(years_operating) %>%
         tally()%>%
         mutate(other_cfp_cities_pct = round(100*(n/sum(n)), digits = 2))%>%
         ungroup() %>%
    # Join on the age category instead of cbind(), so rows are matched by label and not by position
    select(-n), by = "years_operating") %>%
  kable() %>%
  kable_styling() %>%
    scroll_box(width = "650px", height = "400px")
| years_operating | n | pct | other_cfp_cities_pct |
|---|---|---|---|
| 1. 0-3 | 13 | 12.15 | 15.99 |
| 2. 4-10 | 37 | 34.58 | 27.24 |
| 3. 11-20 | 23 | 21.50 | 18.80 |
| 4. 20+ | 34 | 31.78 | 37.96 |

Sources: CFP

main_venue_data %>%
         filter(is.na(years_operating) == FALSE,
                years_operating != "",
                is.na(city) == FALSE) %>%
         #mutate(years_operating = as.numeric(years_operating)) %>%
         mutate(years_operating = case_when(years_operating %in% c("0", "0-3", "0-Jan", "1",
                                                                   "1.0") ~ "1. 0-3",
                                            years_operating %in% c("3-10",  "2", "4-10",
                                                                   "2.0") ~ "2. 4-10",
                                            years_operating %in% c("3", "10-20", "11-20",
                                                                   "3.0") ~ "3. 11-20",
                                            years_operating %in% c("4", "20+", "4\n2",
                                                                   "4.0") ~ "4. 20+")) %>%
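         # Tokyo: override with "4. 20+" when the opening year predates 1997. The %in% TRUE wrapper keeps
         # the existing category when year_opened is missing or unparseable (plain ifelse would return NA)
         mutate(years_operating = ifelse((city == "Tokyo" & year(ymd(year_opened)) < 1997) %in% TRUE, "4. 20+", years_operating)) %>%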
         filter(is.na(years_operating) == FALSE) %>%
         mutate(city  = case_when(city == "Berlin" ~ "1. BERLIN, 2017",
                                  city == "New York" ~ "2. NEW YORK CITY, 2018",
                                  city == "Tokyo" ~ "3. TOKYO, 2019",
                                  city == "Stockholm" ~ "4. STOCKHOLM, 2021",
                                  city == "Montreal" ~ "5. MONTREAL, 2022",
                                  city == "Sydney" ~ "6. SYDNEY, 2023",
                                  city == "Rotterdam" ~ "7. ROTTERDAM, 2024",
                                  city == "Copenhagen" ~ "8. COPENHAGEN, 2024")) %>%
         group_by(years_operating, city) %>%
         tally()%>%
         ungroup %>%
         group_by(city) %>%
         mutate(pct = round(100*(n/sum(n)), digits = 2))%>%
         ungroup() %>%
  ggplot()+
  geom_bar(aes(x = years_operating, y = pct, fill = city), 
           stat = "identity", position = "dodge",
           alpha = 0.6) +
  scale_fill_manual(values = c(CityPalette[1], CityPalette[2], CityPalette[3], 
                               CityPalette[4], CityPalette[5], CityPalette[6],
                               CityPalette[7], CityPalette[8]))+
  facet_wrap(~city)+
  theme(legend.direction = "horizontal", legend.position = "bottom")+
  labs(
    title = "Venue Age - CFP Cities",
    subtitle = "",
    x="Years operating",
    y="Percentage of Venues",
    fill = "CFP City",
    caption = "Data: CFP")+
  plotTheme

1.2.6 Venue Age Map

mapView(main_venue_data %>% 
          filter(city == "Copenhagen",
                 is.na(x) == FALSE, is.na(y) == FALSE) %>% 
          mutate(years_operating = case_when(years_operating %in% c("0", "0-3", "0-Jan", "1",
                                                                   "1.0") ~ "1. 0-3",
                                            years_operating %in% c("3-10",  "2", "4-10",
                                                                   "2.0") ~ "2. 4-10",
                                            years_operating %in% c("3", "10-20", "11-20",
                                                                   "3.0") ~ "3. 11-20",
                                            years_operating %in% c("4", "20+", "4\n2",
                                                                   "4.0") ~ "4. 20+")) %>%
          st_as_sf(coords = c("x", "y"), crs = 4326), zcol = "years_operating" )
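
The recoding of raw `years_operating` codes into four age bands is repeated in several chunks above. A small helper would keep the code lists in one place. The sketch below is a hypothetical base R function, with the code lists copied from the `case_when()` calls in this document; it is not part of the existing code base.

```r
# Hypothetical helper: collapse the raw years_operating codes into four labels.
# Unrecognized, empty, or missing values come back as NA.
recode_years_operating <- function(x) {
  x <- as.character(x)
  out <- rep(NA_character_, length(x))
  out[x %in% c("0", "0-3", "0-Jan", "1", "1.0")] <- "1. 0-3"
  out[x %in% c("3-10", "2", "4-10", "2.0")]      <- "2. 4-10"
  out[x %in% c("3", "10-20", "11-20", "3.0")]    <- "3. 11-20"
  out[x %in% c("4", "20+", "4\n2", "4.0")]       <- "4. 20+"
  out
}

recode_years_operating(c("1", "4-10", "3.0", "20+", "", NA))
```

Inside a pipeline this would be used as `mutate(years_operating = recode_years_operating(years_operating))`. The Tokyo-specific override based on `year_opened` would still need to be applied as a separate step.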

2 Programming Variables

Sources: CFP

2.1 Programming Variables Overview

Program ratings were, on average, extraordinarily high, specifically with regards to Creative Output, Experimentation, and Promotion of artistic content.

Some observations:

  • Copenhagen’s experimentation and promotion ratings were the highest of any CFP city, and creative output was second only to Sydney. Community focus was second lowest of any city.

  • There were notably few venues that were assessed poorly across all categories.

  • Programming characteristics are generally similar between venue types, including venues that also have restaurant programming. There was some slightly higher assessment for clubs.

main_venue_data %>% 
                   filter(city == "Copenhagen") %>%
                  select(experimentation, creative_output, 
                         community_focus, promotion)%>%
  gather(key = "variable", value = "value") %>%
  filter(value != 0) %>%
  mutate(variable = case_when(variable == "community_focus" ~ "Community Focus",
                              variable == "experimentation" ~ "Experimentation",
                              variable == "creative_output" ~ "Creative Output",
                              variable == "promotion" ~ "Promotion")) %>%
ggplot()+ 
  geom_histogram(aes(value), position = "dodge", binwidth = 1, alpha = 0.6, fill = CityPalette[8]) + 
  facet_wrap(~variable)+ 
  labs(
    title = "Distribution of Programming Variables",
    subtitle = "",
    caption = "Data: CFP")+
  ylab("Number of venues")+
  xlab("Rating (1-4)")+
  plotTheme

main_venue_data %>% 
                   filter(city == "Copenhagen") %>%
                  mutate(cumulative_score = experimentation + creative_output +
                         community_focus+ promotion)%>%
ggplot()+ 
  geom_histogram(aes(cumulative_score), position = "dodge", binwidth = 1, alpha = 0.6, fill = CityPalette[8]) + 
  labs(
    title = "Cumulative Programming Variables by Venue",
    subtitle = "Four categories scored 1-4",
    caption = "Data: CFP")+
  ylab("Number of venues")+
  xlab("Cumulative rating")+
  plotTheme

main_venue_data %>% 
                  mutate(cumulative_score = experimentation + creative_output +
                         community_focus+ promotion)%>%
ggplot()+ 
  geom_histogram(aes(cumulative_score), position = "dodge", binwidth = 1, alpha = 0.6, fill = CityPalette[8]) + 
  labs(
    title = "Cumulative Programming Variables by Venue",
    subtitle = "Four categories scored 1-4",
    caption = "Data: CFP")+
  ylab("Number of venues")+
  xlab("Cumulative rating")+
  facet_wrap(~city, scales = "free")+
  plotTheme

Note: NYC and Berlin cumulative ratings are from 1-9 because the Community Focus question had not yet been added to the survey.

main_venue_data %>% 
   mutate(community_focus = ifelse(is.na(community_focus) == TRUE, 0, community_focus),
          cumulative_score = experimentation + creative_output + community_focus + promotion,
          mean_score = ifelse(city %in% c("Berlin", "New York"), 
                              cumulative_score / 3, cumulative_score / 4 ))%>%
  select(city, cumulative_score, mean_score) %>%
gather(-city, value = "value", key = "variable") %>%
  group_by(city, variable) %>%
  summarize(mean_score = mean(value, na.rm = TRUE),
            score_sd = sd(value, na.rm = TRUE)) %>%
  ggplot()+
  geom_bar(aes(x = city, y = mean_score), stat = "identity")+
  facet_wrap(~variable, scales = "free")+
  plotTheme
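
The per-city normalization in the chunk above can be checked on a toy example. This is a minimal sketch with made-up ratings (not CFP data); it uses a missing `community_focus` value, rather than the city name, to decide whether to divide by 3 or 4, which is an assumption made for illustration.

```r
library(dplyr)

# Made-up ratings for illustration; not CFP data.
toy <- data.frame(
  city = c("Berlin", "Copenhagen"),
  experimentation = c(2, 3),
  creative_output = c(3, 4),
  community_focus = c(NA, 2),
  promotion = c(3, 3)
)

toy_scores <- toy %>%
  mutate(
    # Number of categories that were actually rated for this venue
    n_rated = ifelse(is.na(community_focus), 3, 4),
    # Treat the unasked category as 0 so the sum is defined
    community_focus = ifelse(is.na(community_focus), 0, community_focus),
    cumulative_score = experimentation + creative_output +
      community_focus + promotion,
    # Dividing by the number of rated categories makes cities comparable
    mean_score = cumulative_score / n_rated
  )

toy_scores
```

Because the cumulative score depends on how many categories were asked, the mean score is the safer quantity to compare across survey rounds.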

main_venue_data %>% 
    select(city, experimentation, creative_output, 
           community_focus, promotion)%>%
    gather(-city, key = "variable", value = "value") %>%
    filter(value != 0) %>%
    mutate(variable = case_when(variable == "community_focus" ~ "Community Focus",
                                variable == "experimentation" ~ "Experimentation",
                                variable == "creative_output" ~ "Creative Output",
                                variable == "promotion" ~ "Promotion")) %>%
    ggplot()+ 
    geom_density(aes(value, color = variable), 
                 alpha = 0.6) + 
    facet_wrap(~city)+ 
    labs(
        title = "Distribution of Programming Variables",
        subtitle = "",
        caption = "Data: CFP")+
    ylab("")+
    xlab("Rating (1-4)")+
    plotTheme

2.1.1. Programming ratings by venue use type

This is a new section. We look at programming ratings for each use type, and we compare Copenhagen against the CFP cities surveyed under the current data collection protocol for use type.

In Copenhagen (and across the data set), restaurants, discos and arenas have relatively low ratings across all indicators. Galleries, shops and studios have relatively higher ratings.

main_venue_data %>% 
  filter(city == "Copenhagen") %>%
select(experimentation, creative_output, community_focus, 
       promotion, contains("venueType")) %>%
    pivot_longer(
        cols = starts_with("venueType_"), 
        names_to = "venueType",
        values_to = "count") %>%
    filter(count > 0)  %>% 
  group_by(venueType) %>% 
  summarize(mean_experimentation = round(mean(experimentation), digits = 1), 
            mean_community_focus = round(mean(community_focus), digits = 1), 
            mean_promotion = round(mean(promotion), digits = 1), 
            mean_creative_output = round(mean(creative_output), digits = 1)) %>%
  gather(-venueType, key = "variable", value = "value") %>%
  ggplot()+
  geom_bar(aes(x = variable, y = value, fill = variable), stat = "identity") +
  coord_flip()+
  facet_wrap(~venueType)+
  labs(
        title = "Average Programming Rating by Use Type",
        subtitle = "",
        caption = "Data: CFP")+
    xlab("Programming variable")+
    ylab("Rating (1-4)")+
  theme(legend.position = "none")+
  plotTheme
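
Because a venue can carry several use flags, pivoting the `venueType_` columns counts that venue once under each of its uses. The sketch below shows this on made-up data (the venues, flags and ratings are hypothetical).

```r
library(dplyr)
library(tidyr)

# Made-up venues for illustration; not CFP data.
toy <- data.frame(
  venue = c("A", "B", "C"),
  experimentation = c(4, 2, 3),
  venueType_club = c(1, 0, 1),
  venueType_gallery = c(1, 1, 0)
)

by_type <- toy %>%
  pivot_longer(cols = starts_with("venueType_"),
               names_to = "venueType",
               values_to = "count") %>%
  # Keep only the uses each venue actually has
  filter(count > 0) %>%
  group_by(venueType) %>%
  summarize(n_venues = n(),
            mean_experimentation = mean(experimentation),
            .groups = "drop")

by_type
```

Venue A appears under both club and gallery, so the per-type venue counts sum to more than the number of venues. The per-type means are therefore not independent of each other.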

main_venue_data %>% 
  filter(city == "Copenhagen") %>%
select(experimentation, creative_output, community_focus, 
       promotion, contains("venueType")) %>%
    pivot_longer(
        cols = starts_with("venueType_"), 
        names_to = "venueType",
        values_to = "count") %>%
    filter(count > 0)  %>% 
  group_by(venueType) %>% 
  summarize(mean_experimentation = round(mean(experimentation), digits = 1), 
            mean_community_focus = round(mean(community_focus), digits = 1), 
            mean_promotion = round(mean(promotion), digits = 1), 
            mean_creative_output = round(mean(creative_output), digits = 1)) %>%
  kable() %>%
  kable_styling()

| venueType | mean_experimentation | mean_community_focus | mean_promotion | mean_creative_output |
| --- | --- | --- | --- | --- |
| venueType_arena | 1.2 | 1.2 | 3.0 | 3.2 |
| venueType_cinema | 2.3 | 2.7 | 3.3 | 3.0 |
| venueType_club | 3.1 | 3.1 | 3.7 | 3.7 |
| venueType_concertHall | 2.8 | 2.7 | 3.4 | 3.6 |
| venueType_disco | 1.3 | 1.6 | 2.1 | 1.9 |
| venueType_gallery | 3.7 | 3.1 | 3.6 | 3.9 |
| venueType_musicBar | 2.2 | 2.5 | 3.0 | 3.4 |
| venueType_openAir | 2.8 | 2.6 | 3.2 | 3.5 |
| venueType_restaurant | 2.7 | 2.8 | 2.8 | 3.0 |
| venueType_shop | 3.4 | 3.4 | 2.8 | 3.4 |
| venueType_studio | 3.3 | 3.2 | 3.3 | 3.5 |
| venueType_theater | 3.0 | 3.5 | 3.0 | 3.2 |
| venueType_warehouse | 2.7 | 2.4 | 2.7 | 3.0 |

main_venue_data %>% 
  filter(city %in% c("Sydney", "Montreal", "Rotterdam", "Stockholm", "Copenhagen" )) %>%
select(experimentation, creative_output, community_focus, 
       promotion, contains("venueType")) %>%
    pivot_longer(
        cols = starts_with("venueType_"), 
        names_to = "venueType",
        values_to = "count") %>%
    filter(count > 0)  %>% 
  group_by(venueType) %>% 
  summarize(mean_experimentation = round(mean(experimentation, na.rm = TRUE), digits = 1), 
            mean_community_focus = round(mean(community_focus, na.rm = TRUE), digits = 1), 
            mean_promotion = round(mean(promotion, na.rm = TRUE), digits = 1), 
            mean_creative_output = round(mean(creative_output, na.rm = TRUE), digits = 1)) %>%
  gather(-venueType, key = "variable", value = "value") %>%
  ggplot()+
  geom_bar(aes(x = variable, y = value, fill = variable), stat = "identity") +
  coord_flip()+
  facet_wrap(~venueType)+
  labs(
        title = "Average Programming Variables by Use Type - CFP Cities 2020-Present",
        subtitle = "Current data collection protocol includes Stockholm, Montreal, Sydney, Rotterdam and Copenhagen",
        caption = "Data: CFP")+
    xlab("Programming variable")+
    ylab("Rating (1-4)")+
  theme(legend.position = "none")+
  plotTheme

2.1.2 Cumulative Program Rating Map

main_venue_data %>% 
          filter(city == "Copenhagen",
                 is.na(x) == FALSE, is.na(y) == FALSE) %>% 
                  mutate(cumulative_score = experimentation + creative_output +
                         community_focus+ promotion) %>%
          st_as_sf(coords = c("x", "y"), crs = 4326) %>%
          mapView(., zcol = "cumulative_score" )+
  mapview(d2_aggregates)

2.2 Programming Summaries by City

Rotterdam ranked poorly relative to other cities on average programming scores, with the lowest or second-lowest mean (including ties) in every category except Community Focus across the 8 cities.

main_venue_data %>% 
                  select(city, experimentation, creative_output, 
                         community_focus, promotion)%>%
  gather(-city, key = "variable", value = "value") %>%
  filter(value != 0) %>%
  mutate(variable = case_when(variable == "community_focus" ~ "Community Focus",
                              variable == "experimentation" ~ "Experimentation",
                              variable == "creative_output" ~ "Creative Output",
                              variable == "promotion" ~ "Promotion")) %>%
  group_by(variable, city) %>%
  summarize(mean = round(mean(value), digits = 1)) %>%
  spread(city, mean) %>%
  kable() %>%
  kable_styling()
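
| variable | Berlin | Copenhagen | Montreal | New York | Rotterdam | Stockholm | Sydney | Tokyo |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Community Focus | NA | 2.5 | 2.7 | NA | 2.6 | 2.0 | 2.9 | 2.7 |
| Creative Output | 2.5 | 3.2 | 2.9 | 3.1 | 2.7 | 2.7 | 3.5 | 2.7 |
| Experimentation | 2.4 | 2.6 | 2.4 | 2.4 | 2.2 | 1.8 | 2.3 | 2.4 |
| Promotion | 2.5 | 3.0 | 2.9 | 2.7 | 2.1 | 2.2 | 2.8 | 3.2 |
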
main_venue_data %>% 
                  select(city, experimentation, creative_output, 
                         community_focus, promotion)%>%
  gather(-city, key = "variable", value = "value") %>%
  filter(value != 0) %>%
  mutate(variable = case_when(variable == "community_focus" ~ "Community Focus",
                              variable == "experimentation" ~ "Experimentation",
                              variable == "creative_output" ~ "Creative Output",
                              variable == "promotion" ~ "Promotion"),
         city  = case_when(city == "Berlin" ~ "1. BERLIN, 2017",
                                  city == "New York" ~ "2. NEW YORK CITY, 2018",
                                  city == "Tokyo" ~ "3. TOKYO, 2019",
                                  city == "Stockholm" ~ "4. STOCKHOLM, 2021",
                                  city == "Montreal" ~ "5. MONTREAL, 2022",
                           city == "Sydney" ~ "6. SYDNEY, 2023",
                           city == "Rotterdam" ~ "7. ROTTERDAM, 2024",
                           city == "Copenhagen" ~ "8. COPENHAGEN, 2024")) %>%
  group_by(variable, city) %>%
  summarize(mean = round(mean(value), digits = 1)) %>%
  ungroup() %>%
  ggplot()+
  geom_bar(aes(y = mean, x = city, fill = city), stat = "identity", alpha = 0.6)+
  scale_fill_manual(values = CityPalette) + 
  facet_wrap(~variable)+
  coord_flip()+
  plotTheme
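
City rankings can be read straight off the table of means in this section. The sketch below re-enters the Experimentation and Promotion rows from that table by hand and ranks the cities; the ranking convention (1 = highest, ties share the best rank) is an assumption chosen for illustration.

```r
# Means copied from the city summary table in this section
means <- data.frame(
  city = c("Berlin", "Copenhagen", "Montreal", "New York",
           "Rotterdam", "Stockholm", "Sydney", "Tokyo"),
  experimentation = c(2.4, 2.6, 2.4, 2.4, 2.2, 1.8, 2.3, 2.4),
  promotion = c(2.5, 3.0, 2.9, 2.7, 2.1, 2.2, 2.8, 3.2)
)

# Rank 1 = highest mean; tied cities share the smallest rank
means$experimentation_rank <- rank(-means$experimentation, ties.method = "min")
means$promotion_rank <- rank(-means$promotion, ties.method = "min")

# Show cities from best to worst promotion score
means[order(means$promotion_rank), ]
```

Since the table values are rounded to one decimal, several cities tie, so small rank differences should be read with caution.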

2.3 Experimental Program Measurements

Q: Compared to other venues in the city: Is this venue a platform for new and experimental trends, sounds and art forms? Is it a place for niche genres and experimental performers as well as extraordinary event concepts?

ggplot(data = main_venue_data %>%
         filter(is.na(experimentation) == FALSE,
                city == "Copenhagen"))+
  geom_bar(aes(experimentation), fill = CityPalette[8], alpha = 0.6, size = 3)+
  #scale_fill_viridis_d()+
  labs(
    title = "Likelihood of experimental program",
    subtitle = "1 = Not at all likely -> 4 = Very likely",
    x="Experimentation Score (low to high)",
    y="Total Venues",
    #fill = "CFP City",
    caption = "Data: CFP")+
  plotTheme

2.3.1 Cross-tabs - Experimentation by size chart

The overall pattern holds amongst venue size categories.

ggplot(data = main_venue_data %>% 
         filter(city == 'Copenhagen',
                is.na(size) == FALSE,
                size != "",
                experimentation > 0) %>%
         mutate(experimentation = case_when(experimentation == 1 ~ "1. Not At All",
                                            experimentation == 2 ~ "2. Not too",
                                            experimentation == 3 ~ "3. Somewhat",
                                            experimentation == 4 ~ "4. Very"),
                size = case_when(size == 1 ~ "1. < 100 m^2",
                                        size == 2 ~ "2. 101-500 m^2",
                                        size == 3 ~ "3. 501-1000 m^2",
                                        size %in% c(4, 5) ~ "4. 1001 m^2+")) %>%
         group_by(size, experimentation) %>%
         tally()%>%
         ungroup() %>%
         group_by(size) %>%
  mutate(pct = round(100*(n/sum(n)), digits = 2)))+
        geom_bar(aes(y = n, x = size, fill = experimentation), 
                 stat = 'identity', position = "dodge",
                 alpha = 0.6)+ 
  scale_fill_viridis_d(guide = guide_legend())+
  theme(legend.direction = "horizontal", legend.position = "bottom")+
  labs(
    title = "EXPERIMENTAL CONTENT SCORES BY VENUE SIZE",
    subtitle = "Venues of all sizes rate low for experimental programming",
    x="",
    y="Number of venues in size category",
    fill = "LIKELIHOOD OF EXPERIMENTAL CONTENT",
    caption = "Data: CFP"
  )+
  plotTheme

#ggsave("sydney_images/experimentation_size.png")
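
The cross-tab chunks compute a `pct` column alongside the count `n`. As a minimal sketch on made-up data (not CFP data), this is what that within-size percentage holds:

```r
library(dplyr)

# Made-up ratings for illustration; not CFP data.
toy <- data.frame(
  size = c("1. < 100 m^2", "1. < 100 m^2", "1. < 100 m^2",
           "2. 101-500 m^2", "2. 101-500 m^2"),
  experimentation = c(4, 4, 2, 3, 1)
)

shares <- toy %>%
  # One row per size/rating combination with its count n
  count(size, experimentation) %>%
  group_by(size) %>%
  # Share of each rating within its size category
  mutate(pct = round(100 * n / sum(n), digits = 2)) %>%
  ungroup()

shares
```

Mapping `y = pct` instead of `y = n` in `geom_bar()` would plot these shares, which makes size categories with very different venue counts easier to compare.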

2.3.2 Cross-tabs - Experimentation by size table

We see basically the same shake-out here as we see in other cities.

main_venue_data %>% 
         filter(city == 'Copenhagen',
                is.na(size) == FALSE,
                experimentation > 0) %>%
         mutate(size = case_when(size == 1 ~ "1. < 100 m^2",
                                        size == 2 ~ "2. 101-500 m^2",
                                        size == 3 ~ "3. 501-1000 m^2",
                                        size %in% c(4, 5) ~ "4. 1001 m^2+")) %>%
         group_by(size) %>%
         summarize(count=n(),
           mean_experimentation = round(mean(experimentation, na.rm = TRUE),
                                                digits = 2))%>%
         ungroup() %>%
  cbind(., main_venue_data %>% 
         filter(is.na(size) == FALSE,
                size != "",
                experimentation > 0) %>%
         mutate(
                size = case_when(size == 1 ~ "1. < 100 m^2",
                                        size == 2 ~ "2. 101-500 m^2",
                                        size == 3 ~ "3. 501-1000 m^2",
                                        size %in% c(4, 5) ~ "4. 1001 m^2+")) %>%
          filter(is.na(size) == FALSE) %>%
         group_by(size) %>%
   summarize(cfp_average = round(mean(experimentation, na.rm = TRUE), digits = 2)) %>%
    select(-size)) %>%
  kable() %>%
  kable_styling()

| size | count | mean_experimentation | cfp_average |
| --- | --- | --- | --- |
| 1. < 100 m^2 | 9 | 2.78 | 2.39 |
| 2. 101-500 m^2 | 37 | 2.73 | 2.35 |
| 3. 501-1000 m^2 | 23 | 2.52 | 2.43 |
| 4. 1001 m^2+ | 39 | 2.46 | 2.25 |
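
The chunk above pairs the Copenhagen means with the all-city means using `cbind()`, which relies on both summaries having the same size categories in the same order. A keyed join is one way to make that pairing explicit. This is a sketch with made-up numbers; `city_means` and `all_means` are hypothetical names.

```r
library(dplyr)

# Made-up summaries for illustration; not CFP data.
city_means <- data.frame(
  size = c("1. < 100 m^2", "2. 101-500 m^2"),
  mean_experimentation = c(2.8, 2.7)
)
all_means <- data.frame(
  size = c("1. < 100 m^2", "2. 101-500 m^2", "3. 501-1000 m^2"),
  cfp_average = c(2.4, 2.3, 2.4)
)

# Rows are matched on the size label, not on position
joined <- left_join(city_means, all_means, by = "size")

joined
```

With a join, a size category missing from one city cannot shift the comparison column out of alignment.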

2.3.3 Cross-tabs - Experimentation by age table

main_venue_data %>% 
         filter(city == 'Copenhagen',
                is.na(size) == FALSE,
                experimentation > 0) %>%
         mutate(years_operating = as.numeric(years_operating)) %>%
         mutate(years_operating = case_when(years_operating == 1 ~ "1. 0-3",
                                            years_operating == 2 ~ "2. 4-10",
                                            years_operating == 3 ~ "3. 11-20",
                                            years_operating == 4 ~ "4. 20+")) %>%
         filter(is.na(years_operating) == FALSE) %>%
         group_by(years_operating) %>%
         summarize(count=n(),
           mean_experimentation = round(mean(experimentation, na.rm = TRUE),
                                                digits = 2))%>%
         ungroup() %>%
  kable() %>%
  kable_styling()

| years_operating | count | mean_experimentation |
| --- | --- | --- |
| 1. 0-3 | 13 | 2.77 |
| 2. 4-10 | 37 | 2.62 |
| 3. 11-20 | 23 | 2.52 |
| 4. 20+ | 34 | 2.59 |

2.4. Creative Output

Q: Do artists in this venue perform live sets and/or original works? If DJing, is it performed to a level of artistic merit?

2.4.1 Cross-tabs - Creative Output by size chart

Interestingly, creative output scores are much higher among the smaller venue size categories.

ggplot(data = main_venue_data %>% 
         filter(city == 'Copenhagen',
                is.na(size) == FALSE,
                size != "",
                creative_output > 0) %>%
         mutate(creative_output = case_when(creative_output == 1 ~ "1. Not At All",
                                            creative_output == 2 ~ "2. Not too",
                                            creative_output == 3 ~ "3. Somewhat",
                                            creative_output == 4 ~ "4. Very"),
                size = case_when(size == 1 ~ "1. < 100 m^2",
                                        size == 2 ~ "2. 101-500 m^2",
                                        size == 3 ~ "3. 501-1000 m^2",
                                        size %in% c(4, 5) ~ "4. 1001 m^2+")) %>%
         group_by(size, creative_output) %>%
         tally()%>%
         ungroup() %>%
         group_by(size) %>%
  mutate(pct = round(100*(n/sum(n)), digits = 2)))+
        geom_bar(aes(y = n, x = size, fill = creative_output), 
                 stat = 'identity', position = "dodge",
                 alpha = 0.6)+
  scale_fill_viridis_d(guide = guide_legend())+
  theme(legend.direction = "horizontal", legend.position = "bottom")+
  labs(
    title = "Creative Output by Venue Size",
    subtitle = "",
    x="",
    y="Number of venues in size category",
    fill = "Likelihood of Original Creative Content",
    caption = "Data: CFP"
  )+
  plotTheme

#ggsave("sydney_images/creativeOutput.png")

2.4.3 Cross-tabs - Creative Output by age table

main_venue_data %>% 
         filter(city == 'Copenhagen',
                is.na(size) == FALSE,
                creative_output > 0) %>%
         mutate(years_operating = as.numeric(years_operating)) %>%
         mutate(years_operating = case_when(years_operating == 1 ~ "1. 0-3",
                                            years_operating == 2 ~ "2. 4-10",
                                            years_operating == 3 ~ "3. 11-20",
                                            years_operating == 4 ~ "4. 20+")) %>%
         filter(is.na(years_operating) == FALSE) %>%
         group_by(years_operating) %>%
         summarize(count=n(),
           mean_creative_output = round(mean(creative_output, na.rm = TRUE),
                                                digits = 2))%>%
         ungroup() %>%
  kable() %>%
  kable_styling()

| years_operating | count | mean_creative_output |
| --- | --- | --- |
| 1. 0-3 | 13 | 3.23 |
| 2. 4-10 | 37 | 2.84 |
| 3. 11-20 | 23 | 3.26 |
| 4. 20+ | 34 | 3.59 |

2.4.2 Cross-tabs - Creative Output by size table

main_venue_data %>% 
         filter(city == 'Copenhagen',
                is.na(size) == FALSE,
                creative_output > 0) %>%
         mutate(size = case_when(size == 1 ~ "1. < 100 m^2",
                                        size == 2 ~ "2. 101-500 m^2",
                                        size == 3 ~ "3. 501-1000 m^2",
                                        size %in% c(4, 5) ~ "4. 1001 m^2+")) %>%
         group_by(size) %>%
         summarize(count=n(),
           mean_creative_output = round(mean(creative_output, na.rm = TRUE),
                                                digits = 2))%>%
         ungroup() %>%
  kable() %>%
  kable_styling()

| size | count | mean_creative_output |
| --- | --- | --- |
| 1. < 100 m^2 | 9 | 3.56 |
| 2. 101-500 m^2 | 37 | 3.43 |
| 3. 501-1000 m^2 | 23 | 3.04 |
| 4. 1001 m^2+ | 39 | 3.05 |

2.5. Community Focused Program Measurements

Q: Is the venue a consistent and regular platform for a niche genre and a stage for its emerging acts? Is it a hub for certain marginalised and/or underrepresented groups, scenes, milieus or is it a hotspot for the immediate neighbourhood to mingle? Is it known as an inclusive space for LGBTQIA+ artists/performers and audiences? Do venues platform other specific communities? Do they strive to create an inclusive environment?

ggplot(data = main_venue_data %>%
         filter(is.na(community_focus) == FALSE,
                city == "Copenhagen"))+
  geom_bar(aes(community_focus), fill = CityPalette[8], alpha = 0.6, size = 3)+
  #scale_fill_viridis_d()+
  labs(
    title = "Likelihood of community focused program",
    subtitle = "1 = Not at all likely -> 4 = Very likely",
    x="Community Focus Score (low to high)",
    y="Total Venues",
    #fill = "CFP City",
    caption = "Data: CFP")+
  plotTheme

2.5.1 Cross-tabs - Community Focus by size chart

The largest venues are least likely to be community focused. No surprises there.

ggplot(data = main_venue_data %>% 
         filter(city == 'Copenhagen',
                is.na(size) == FALSE,
                size != "",
                community_focus > 0) %>%
         mutate(community_focus = case_when(community_focus == 1 ~ "1. Not At All",
                                            community_focus == 2 ~ "2. Not too",
                                            community_focus == 3 ~ "3. Somewhat",
                                            community_focus == 4 ~ "4. Very"),
                size = case_when(size == 1 ~ "1. < 100 m^2",
                                        size == 2 ~ "2. 101-500 m^2",
                                        size == 3 ~ "3. 501-1000 m^2",
                                        size %in% c(4, 5) ~ "4. 1001 m^2+")) %>%
         group_by(size, community_focus) %>%
         tally()%>%
         ungroup() %>%
         group_by(size) %>%
  mutate(pct = round(100*(n/sum(n)), digits = 2)))+
        geom_bar(aes(y = n, x = size, fill = community_focus), 
                 stat = 'identity', position = "dodge",
                 alpha = 0.6)+
  scale_fill_viridis_d(guide = guide_legend())+
  theme(legend.direction = "horizontal", legend.position = "bottom")+
  labs(
    title = "Community Focus by Venue Size",
    subtitle = "",
    x="",
    y="Number of venues in size category",
    fill = "Likelihood of Community Focused Program",
    caption = "Data: CFP"
  )+
  plotTheme

2.5.2 Cross-tabs - Community Focus by size table

main_venue_data %>% 
         filter(city == 'Copenhagen',
                is.na(size) == FALSE,
                community_focus > 0) %>%
         mutate(size = case_when(size == 1 ~ "1. < 100 m^2",
                                        size == 2 ~ "2. 101-500 m^2",
                                        size == 3 ~ "3. 501-1000 m^2",
                                        size %in% c(4, 5) ~ "4. 1001 m^2+")) %>%
         group_by(size) %>%
         summarize(count=n(),
           mean_community_focus = round(mean(community_focus, na.rm = TRUE),
                                                digits = 2))%>%
         ungroup() %>%
  kable() %>%
  kable_styling()

| size | count | mean_community_focus |
| --- | --- | --- |
| 1. < 100 m^2 | 9 | 2.78 |
| 2. 101-500 m^2 | 37 | 2.84 |
| 3. 501-1000 m^2 | 23 | 2.61 |
| 4. 1001 m^2+ | 39 | 2.10 |

2.5.3 Cross-tabs - Community Focus by age table

main_venue_data %>% 
         filter(city == 'Copenhagen',
                is.na(size) == FALSE,
                community_focus > 0) %>%
         mutate(years_operating = as.numeric(years_operating)) %>%
         mutate(years_operating = case_when(years_operating == 1 ~ "1. 0-3",
                                            years_operating == 2 ~ "2. 4-10",
                                            years_operating == 3 ~ "3. 11-20",
                                            years_operating == 4 ~ "4. 20+")) %>%
         filter(is.na(years_operating) == FALSE) %>%
         group_by(years_operating) %>%
         summarize(count=n(),
           mean_community_focus = round(mean(community_focus, na.rm = TRUE),
                                                digits = 2))%>%
         ungroup() %>%
  kable() %>%
  kable_styling()

| years_operating | count | mean_community_focus |
| --- | --- | --- |
| 1. 0-3 | 13 | 2.54 |
| 2. 4-10 | 37 | 2.30 |
| 3. 11-20 | 23 | 2.61 |
| 4. 20+ | 34 | 2.71 |

What is the global trend across the data set?

ALL CITIES

main_venue_data %>% 
         filter(is.na(size) == FALSE,
                size != "",
                community_focus > 0) %>%
         mutate(
                size = case_when(size == 1 ~ "1. < 100 m^2",
                                        size == 2 ~ "2. 101-500 m^2",
                                        size == 3 ~ "3. 501-1000 m^2",
                                        size %in% c(4, 5) ~ "4. 1001 m^2+")) %>%
         group_by(size) %>%
  summarize(mean_community_focus = round(mean(community_focus, na.rm = TRUE), digits = 2)) %>%
  kable() %>%
  kable_styling()

| size | mean_community_focus |
| --- | --- |
| 1. < 100 m^2 | 2.83 |
| 2. 101-500 m^2 | 2.66 |
| 3. 501-1000 m^2 | 2.67 |
| 4. 1001 m^2+ | 2.26 |
| NA | 3.00 |
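
The `NA` row in the all-city table is what `case_when()` produces when a size code matches none of its conditions. The sketch below reproduces this on made-up data. That the real `NA` group comes from codes outside 1-5 is an assumption, since the raw codes are not shown here.

```r
library(dplyr)

# Made-up size codes for illustration; 6 stands in for any code outside 1-5.
toy <- data.frame(size = c(1, 4, 5, 6),
                  community_focus = c(3, 2, 2, 3))

labelled <- toy %>%
  mutate(size_label = case_when(size == 1 ~ "1. < 100 m^2",
                                size == 2 ~ "2. 101-500 m^2",
                                size == 3 ~ "3. 501-1000 m^2",
                                size %in% c(4, 5) ~ "4. 1001 m^2+"))

# The unmatched code falls through every condition and becomes NA
labelled$size_label

# Dropping unlabelled rows before summarising removes the NA group
size_means <- labelled %>%
  filter(!is.na(size_label)) %>%
  group_by(size_label) %>%
  summarize(mean_community_focus = mean(community_focus), .groups = "drop")

size_means
```

Filtering on the recoded label, as the experimentation chunk earlier does with `filter(is.na(size) == FALSE)` after the `mutate()`, keeps the stray group out of the table.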

2.6. Promotion Program Measurements

Q: Is the promotion/marketing of this space focused on artistic content (artists, lineups, performances)? Are musicians the main reason why people attend these venues, and not e.g. culinary offers?

Copenhagen is very highly rated in this component, especially amongst the oldest category of venues and the small/medium size range.

ggplot(data = main_venue_data %>%
         filter(is.na(promotion) == FALSE,
                city == "Copenhagen"))+
  geom_bar(aes(promotion), fill = CityPalette[8], alpha = 0.6, size = 3)+
  #scale_fill_viridis_d()+
  labs(
    title = "Likelihood of promotion focused on artistic content",
    subtitle = "1 = Not at all likely -> 4 = Very likely",
    x="Promotional Score (low to high)",
    y="Total Venues",
    #fill = "CFP City",
    caption = "Data: CFP")+
  plotTheme

2.6.1 Cross-tabs - Promotional Focus on Artistic Content by size chart

Artistic content promotion is notably highly rated here, especially amongst older venues.

ggplot(data = main_venue_data %>% 
         filter(city == 'Copenhagen',
                is.na(size) == FALSE,
                size != "",
                promotion > 0) %>%
         mutate(promotion = case_when(promotion == 1 ~ "1. Not At All",
                                            promotion == 2 ~ "2. Not too",
                                            promotion == 3 ~ "3. Somewhat",
                                            promotion == 4 ~ "4. Very"),
                size = case_when(size == 1 ~ "1. < 100 m^2",
                                        size == 2 ~ "2. 101-500 m^2",
                                        size == 3 ~ "3. 501-1000 m^2",
                                        size %in% c(4, 5) ~ "4. 1001 m^2+")) %>%
         group_by(size, promotion) %>%
         tally()%>%
         ungroup() %>%
         group_by(size) %>%
  mutate(pct = round(100*(n/sum(n)), digits = 2)))+
        geom_bar(aes(y = n, x = size, fill = promotion), 
                 stat = 'identity', position = "dodge",
                 alpha = 0.6)+
  scale_fill_viridis_d(guide = guide_legend())+
  theme(legend.direction = "horizontal", legend.position = "bottom")+
  labs(
    title = "Likelihood Of Promotion of Artistic Content by Venue Size",
    subtitle = "",
    x="",
    y="Number of venues in size category",
    fill = "Likelihood of Artistic Promotion",
    caption = "Data: CFP"
  )+
  plotTheme

2.6.2 Cross-tabs - Promotional Focus on Artistic Content by size table

main_venue_data %>% 
         filter(city == 'Copenhagen',
                is.na(size) == FALSE,
                promotion > 0) %>%
         mutate(size = case_when(size == 1 ~ "1. < 100 m^2",
                                        size == 2 ~ "2. 101-500 m^2",
                                        size == 3 ~ "3. 501-1000 m^2",
                                        size %in% c(4, 5) ~ "4. 1001 m^2+")) %>%
         group_by(size) %>%
         summarize(count=n(),
           mean_promotion = round(mean(promotion, na.rm = TRUE),
                                                digits = 2))%>%
         ungroup() %>%
  kable() %>%
  kable_styling()

| size | count | mean_promotion |
| --- | --- | --- |
| 1. < 100 m^2 | 9 | 3.33 |
| 2. 101-500 m^2 | 37 | 3.16 |
| 3. 501-1000 m^2 | 23 | 3.04 |
| 4. 1001 m^2+ | 39 | 2.67 |

2.6.3 Cross-tabs - Promotional Focus by age table

main_venue_data %>% 
         filter(city == 'Copenhagen',
                is.na(size) == FALSE,
                promotion > 0) %>%
         mutate(years_operating = as.numeric(years_operating)) %>%
         mutate(years_operating = case_when(years_operating == 1 ~ "1. 0-3",
                                            years_operating == 2 ~ "2. 4-10",
                                            years_operating == 3 ~ "3. 11-20",
                                            years_operating == 4 ~ "4. 20+")) %>%
         filter(is.na(years_operating) == FALSE) %>%
         group_by(years_operating) %>%
         summarize(count=n(),
           mean_promotion = round(mean(promotion, na.rm = TRUE),
                                                digits = 2))%>%
         ungroup() %>%
  kable() %>%
  kable_styling()

| years_operating | count | mean_promotion |
| --- | --- | --- |
| 1. 0-3 | 13 | 2.92 |
| 2. 4-10 | 37 | 2.65 |
| 3. 11-20 | 23 | 3.00 |
| 4. 20+ | 34 | 3.32 |

3 Interdisciplinarity Scores

Roughly 70% of venues have 2+ program uses. This is relatively high, but not the highest we have seen.

This trend holds across all size categories (not charted here, but checked separately).

Sources: CFP

#either cut or perhaps change to a bar chart
main_venue_data %>%
  filter(city == "Copenhagen") %>%
  #mutate(venueType_cinema = as.numeric(ifelse(venueType_cinema == "0", 0, NA))) %>%
  #mutate_at(vars(venueType_disco:venueType_studio), as.numeric) %>%
  # Count every venueType_ flag, including venueType_club, which sits
  # before venueType_disco and falls outside a disco:studio column range
  mutate(number_of_uses = rowSums(across(starts_with("venueType_")))) %>%
  group_by(number_of_uses) %>%
  tally() %>%
  mutate(percentage = round(100*(n/sum(n)), digits = 2)) %>%
  kable() %>%
  kable_styling()

| number_of_uses | n | percentage |
| --- | --- | --- |
| 1 | 32 | 29.63 |
| 2 | 36 | 33.33 |
| 3 | 21 | 19.44 |
| 4 | 14 | 12.96 |
| 5 | 3 | 2.78 |
| 6 | 1 | 0.93 |
| 7 | 1 | 0.93 |
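
Counting uses per venue is a row-wise sum over the `venueType_` flags. A column range such as `a:b` only covers the columns positioned between its two endpoints, so selecting by prefix is one way to be sure every flag is included. A minimal sketch on made-up flags (not CFP data):

```r
library(dplyr)

# Made-up use flags for illustration; not CFP data.
toy <- data.frame(
  venue = c("A", "B", "C", "D"),
  venueType_club = c(1, 0, 1, 0),
  venueType_disco = c(0, 1, 0, 0),
  venueType_studio = c(0, 0, 1, 1)
)

uses <- toy %>%
  # Sum all flag columns selected by prefix, whatever their position
  mutate(number_of_uses = rowSums(across(starts_with("venueType_")))) %>%
  count(number_of_uses) %>%
  mutate(percentage = round(100 * n / sum(n), digits = 2))

# Share of venues with two or more uses
multi_use_share <- sum(uses$percentage[uses$number_of_uses >= 2])

uses
multi_use_share
```

In this toy data only venue C has two uses, so the multi-use share is one venue in four.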

3.1 Interdisciplinarity Content

Programming ratings are, on average, lower for single-use venues. This aligns with observed trends in other cities.

main_venue_data %>% 
    filter(city == "Copenhagen") %>% 
  # Count every venueType_ flag, including venueType_club
  mutate(number_of_uses = rowSums(across(starts_with("venueType_")))) %>%
  mutate(multi_use = ifelse(number_of_uses > 1, "Multi-Use", "Single-Use")) %>%
    select(multi_use, experimentation, creative_output, 
           community_focus, promotion)%>%
    gather(-multi_use, key = "variable", value = "value") %>%
    filter(value != 0) %>%
    mutate(variable = case_when(variable == "community_focus" ~ "Community Focus",
                                variable == "experimentation" ~ "Experimentation",
                                variable == "creative_output" ~ "Creative Output",
                                variable == "promotion" ~ "Promotion")) %>%
    ggplot()+ 
    geom_histogram(aes(value), position = "dodge", binwidth = 1, alpha = 0.6, fill = CityPalette[8]) + 
    facet_grid(multi_use~variable)+ 
    labs(
        title = "Distribution of Programming Variables",
        subtitle = "",
        caption = "Data: CFP")+
    ylab("Number of venues")+
    xlab("Rating (1-4)")+
    plotTheme

main_venue_data %>% 
    filter(city %in% c("Tokyo", "Stockholm", "Montreal", "Sydney", "Rotterdam", "Copenhagen")) %>% 
  # Count every venueType_ flag, including venueType_club
  mutate(number_of_uses = rowSums(across(starts_with("venueType_")))) %>%
  mutate(multi_use = ifelse(number_of_uses > 1, "Multi-Use", "Single-Use")) %>%
    select(multi_use, experimentation, creative_output, 
           community_focus, promotion, city)%>%
    gather(-multi_use, -city, key = "variable", value = "value") %>%
    filter(value != 0) %>%
    mutate(variable = case_when(variable == "community_focus" ~ "Community Focus",
                                variable == "experimentation" ~ "Experimentation",
                                variable == "creative_output" ~ "Creative Output",
                                variable == "promotion" ~ "Promotion")) %>%
          group_by(variable, city, multi_use) %>%
  summarize(mean_value = round(mean(value, na.rm = TRUE), digits = 2)) %>%
    ggplot()+ 
    geom_bar(aes(x= variable, y=mean_value, fill = multi_use), stat = "identity", position = "dodge", alpha = 0.6) + 
    facet_wrap(~city)+ 
    labs(
        title = "Distribution of Programming Variables",
        subtitle = "",
        caption = "Data: CFP")+
    ylab("Mean Score")+
    xlab("Programming Variable")+
    plotTheme

3.2. Multi-Use Space Analysis

What are the most common combinations of uses?

What is going on with Warehouse here?

main_venue_data %>% 
  filter(city == "Copenhagen") %>% 
  select(matches("venueType")) %>%
    mutate(across(everything(), ~ ifelse(. == 1, cur_column(), ""))) %>% 
    mutate(across(everything(), ~ str_replace(., "venueType_", ""))) %>%
    unite("use_combo", everything(), sep = ",") %>%
  mutate(use_combo = toupper(use_combo)) %>%
  group_by(use_combo) %>%
  tally() %>%
  arrange(-n) %>%
  kable() %>%
  kable_styling()
use_combo n
,,,,,,,,,WAREHOUSE,,, 7
,,,,,,MUSICBAR,,,,,, 7
,,,CONCERTHALL,,,,,,,,, 7
,,,CONCERTHALL,,,,,,WAREHOUSE,,, 5
,DISCO,,,,,,,,,,, 5
CLUB,,,,,,,,,,,, 4
,,,,,,,,,WAREHOUSE,,,STUDIO 3
,,,,,,MUSICBAR,,RESTAURANT,,,, 3
,,OPENAIR,CONCERTHALL,,,,,,,,, 3
,,,,,,,,RESTAURANT,WAREHOUSE,,, 2
,,,,,,,GALLERY,RESTAURANT,,,, 2
,,,CONCERTHALL,,,,,,,ARENA,, 2
,,,CONCERTHALL,,,,,,WAREHOUSE,,,STUDIO 2
,,,CONCERTHALL,,,,,,WAREHOUSE,ARENA,, 2
,,,CONCERTHALL,,,,,RESTAURANT,WAREHOUSE,,,STUDIO 2
,,,CONCERTHALL,,,MUSICBAR,,,,,, 2
,,,CONCERTHALL,THEATER,,,,,,,, 2
,,OPENAIR,,,,,,,WAREHOUSE,,, 2
CLUB,,,CONCERTHALL,,,,,,,,, 2
CLUB,,,CONCERTHALL,,,,,,,,,STUDIO 2
,,,,,,,,,,,SHOP,STUDIO 1
,,,,,,,,RESTAURANT,,,, 1
,,,,,,,,RESTAURANT,,,,STUDIO 1
,,,,,,,,RESTAURANT,WAREHOUSE,,SHOP,STUDIO 1
,,,,,,,GALLERY,,,,,STUDIO 1
,,,,,,,GALLERY,,WAREHOUSE,,, 1
,,,,,,MUSICBAR,,RESTAURANT,,,SHOP, 1
,,,CONCERTHALL,,,,,,,,,STUDIO 1
,,,CONCERTHALL,,,,GALLERY,,WAREHOUSE,,, 1
,,,CONCERTHALL,,,,GALLERY,,WAREHOUSE,,,STUDIO 1
,,,CONCERTHALL,,,MUSICBAR,,,WAREHOUSE,,, 1
,,,CONCERTHALL,,,MUSICBAR,,RESTAURANT,WAREHOUSE,,, 1
,,,CONCERTHALL,,CINEMA,,,,WAREHOUSE,,, 1
,,,CONCERTHALL,,CINEMA,MUSICBAR,,RESTAURANT,WAREHOUSE,,,STUDIO 1
,,,CONCERTHALL,THEATER,,,,RESTAURANT,WAREHOUSE,,, 1
,,,CONCERTHALL,THEATER,CINEMA,,,,WAREHOUSE,,, 1
,,OPENAIR,,,,,,,,,, 1
,,OPENAIR,,,,,,RESTAURANT,,,, 1
,,OPENAIR,,,,,,RESTAURANT,WAREHOUSE,,, 1
,,OPENAIR,CONCERTHALL,,,,,,,,,STUDIO 1
,,OPENAIR,CONCERTHALL,,,,,,WAREHOUSE,,, 1
,,OPENAIR,CONCERTHALL,,,,,,WAREHOUSE,,,STUDIO 1
,,OPENAIR,CONCERTHALL,,,,,RESTAURANT,WAREHOUSE,,, 1
,,OPENAIR,CONCERTHALL,,,,,RESTAURANT,WAREHOUSE,,,STUDIO 1
,,OPENAIR,CONCERTHALL,,,MUSICBAR,,,WAREHOUSE,,SHOP, 1
,,OPENAIR,CONCERTHALL,,,MUSICBAR,,RESTAURANT,,,, 1
,,OPENAIR,CONCERTHALL,,,MUSICBAR,,RESTAURANT,WAREHOUSE,,, 1
,,OPENAIR,CONCERTHALL,,,MUSICBAR,,RESTAURANT,WAREHOUSE,,SHOP,STUDIO 1
,DISCO,,,,,,,,WAREHOUSE,,, 1
,DISCO,,,,,,,RESTAURANT,WAREHOUSE,,, 1
,DISCO,,CONCERTHALL,,,,,,WAREHOUSE,,, 1
,DISCO,,CONCERTHALL,,,,,,WAREHOUSE,,,STUDIO 1
,DISCO,,CONCERTHALL,,,MUSICBAR,,,,,, 1
CLUB,,,,,,,,,WAREHOUSE,,, 1
CLUB,,,CONCERTHALL,,,,,,WAREHOUSE,,, 1
CLUB,,,CONCERTHALL,,,,,,WAREHOUSE,,,STUDIO 1
CLUB,,,CONCERTHALL,,,,,RESTAURANT,,,, 1
CLUB,,,CONCERTHALL,,,,,RESTAURANT,WAREHOUSE,,, 1
CLUB,,,CONCERTHALL,,,,GALLERY,,,,, 1
CLUB,,OPENAIR,,,,,,,WAREHOUSE,,, 1
CLUB,,OPENAIR,,,,MUSICBAR,,,,,, 1
CLUB,,OPENAIR,CONCERTHALL,,,,,,WAREHOUSE,,, 1
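
The runs of commas in the table above come from uniting empty strings for every use a venue does not have. A minimal base-R sketch of a more readable labelling follows. It assumes 0/1 indicator columns with the `venueType_` prefix, as in `main_venue_data`. The `toy` data frame and the `combo_label()` helper are hypothetical and only illustrate the idea.

```r
# Hypothetical toy data: one row per venue, 0/1 indicators per use type
toy <- data.frame(
  venueType_club        = c(1, 0, 0, 1),
  venueType_concertHall = c(1, 1, 0, 1),
  venueType_warehouse   = c(0, 1, 1, 0)
)

# Hypothetical helper: join only the uses flagged 1 for a single venue
combo_label <- function(flags) {
  uses <- sub("^venueType_", "", names(flags)[flags == 1])
  paste(toupper(uses), collapse = " + ")
}

# apply() passes each row as a named numeric vector
toy$use_combo <- apply(toy[, grepl("^venueType_", names(toy))], 1, combo_label)

# Tally combinations, most common first
combo_counts <- sort(table(toy$use_combo), decreasing = TRUE)
print(combo_counts)
```

Labels built this way keep only the uses that are present, so identical combinations are easier to spot when scanning the tally.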

3.2.1. Uses by District

The multi-use phenomenon is fairly widespread, and there is no clear pattern across districts. Because some districts have so few venues, it is hard to say whether there is a rent-driven trend.

d2_aggregates %>%
  filter(is.na(venue_count) == FALSE) %>%
  rowwise() %>%
  mutate(venueType_sum = sum(c_across(starts_with("venueType_")))) %>%
  ungroup() %>%
  mutate(mean_uses = venueType_sum/venue_count) %>%
  ggplot()+
  geom_bar(aes(x = d2_name, y = mean_uses), stat = "identity")+
  labs(
    title = "Mean space uses by district",
    subtitle = "",
    x="District",
    y="Mean Number of Space Uses",
    caption = "Data: CFP,  City of Copenhagen")+
  plotTheme

d2_aggregates %>%
  filter(is.na(venue_count) == FALSE) %>%
  rowwise() %>%
  mutate(venueType_sum = sum(c_across(starts_with("venueType_")))) %>%
  ungroup() %>%
  mutate(mean_uses = venueType_sum/venue_count) %>%
  ggplot()+
  geom_point(aes(y = mean_uses, x = as.numeric(rent_t2)/1000))+
  geom_label(aes(y = mean_uses, x = as.numeric(rent_t2)/1000, label = d2_name))+
  ylim(0, 5)+
  xlim(0, 70)+
  geom_smooth(aes(x = as.numeric(rent_t2)/1000, y = mean_uses),
                 alpha=0.1,
                 weight = 0.5, color = "red", 
                 se = FALSE,
                 method = "lm") +
    labs(
    title = "Mean space uses by district as a function of mean property market value",
    subtitle = "Mean market values for all structure types, 2022",
    x="Thousand DKK/sqm - 2022",
    y="Mean Number of Space Uses",
    caption = "Data: CFP,  Finans Danmark, City of Copenhagen")+
  plotTheme

3.2.2 Venue Use Summary

Concert hall and warehouse are the most common uses, and both are over-represented in the sample data set relative to other CFP cities. Copenhagen also has a relatively high percentage of open-air and studio spaces.

The “Quotient” in the plot below represents the ratio at which these uses are represented in the sample city compared to CFP cities Tokyo, Stockholm, Montreal, Sydney and Rotterdam - the cities where we have been using our current protocol to measure use.

main_venue_data %>% 
   filter(city %in% c("Copenhagen" )) %>%
 select(contains("venueType")) %>%
     pivot_longer(
         cols = starts_with("venueType_"), 
         names_to = "venueType",
         values_to = "count") %>%
   group_by(venueType)%>%
   summarize(n = sum(count)) %>%
   mutate(local_pct = round(100*(n/nrow(main_venue_data %>% 
   filter(city %in% c("Copenhagen" )))), digits = 2)) %>%
  cbind(., main_venue_data %>% 
   filter(city %in% c("Tokyo", "Stockholm", "Montreal", "Sydney", "Rotterdam")) %>%
 select(contains("venueType")) %>%
     pivot_longer(
         cols = starts_with("venueType_"), 
         names_to = "venueType",
         values_to = "count") %>%
   group_by(venueType)%>%
   summarize(n = sum(count, na.rm = TRUE)) %>%
   mutate(cfp_pct = round(100*(n/nrow(main_venue_data %>% 
   filter(city %in% c("Tokyo", "Stockholm", "Montreal", "Sydney", "Rotterdam")))), digits = 2)) %>%
   select(cfp_pct)) %>%
  mutate(quotient = local_pct / cfp_pct)
##                venueType  n local_pct cfp_pct  quotient
## 1        venueType_arena  4      3.70    1.42 2.6056338
## 2       venueType_cinema  3      2.78    0.79 3.5189873
## 3         venueType_club 17     15.74   15.80 0.9962025
## 4  venueType_concertHall 58     53.70   29.23 1.8371536
## 5        venueType_disco 10      9.26    7.50 1.2346667
## 6      venueType_gallery  7      6.48    3.40 1.9058824
## 7     venueType_musicBar 22     20.37   38.63 0.5273104
## 8      venueType_openAir 20     18.52    3.71 4.9919137
## 9   venueType_restaurant 26     24.07   30.17 0.7978124
## 10        venueType_shop  5      4.63    1.58 2.9303797
## 11      venueType_studio 22     20.37    3.00 6.7900000
## 12     venueType_theater  4      3.70    6.64 0.5572289
## 13   venueType_warehouse 53     49.07   10.90 4.5018349
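
The quotient reported above is a simple ratio of shares. A minimal sketch using three rows copied from the table follows. The helper `use_quotient()` is hypothetical, and because the comparison-city counts are not shown in this document, the published percentages are used directly.

```r
# Shares copied from the table above (percent of venues reporting each use)
shares <- data.frame(
  venueType = c("venueType_club", "venueType_musicBar", "venueType_warehouse"),
  local_pct = c(15.74, 20.37, 49.07),  # Copenhagen
  cfp_pct   = c(15.80, 38.63, 10.90)   # comparison CFP cities, pooled
)

# Hypothetical helper: ratio of the local share to the comparison share
use_quotient <- function(local_pct, cfp_pct) local_pct / cfp_pct

shares$quotient <- use_quotient(shares$local_pct, shares$cfp_pct)
print(shares)
```

A quotient above 1 means the use is more common in Copenhagen than in the comparison cities, and a quotient below 1 means it is less common. Matching the two summaries on `venueType` (for example with a join) rather than binding columns by position would guard against the two tables being sorted differently.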

3.3 Multi-Usage City Comparison

Copenhagen has a very large number of multi-use spaces. On average, a space serves about 2.2 different programming uses (median 2), second only to Rotterdam (2.4) among the cities compared.

main_venue_data %>%
  filter(city != "",
         city %in% c("Tokyo", "Stockholm", "Montreal", "Sydney", "Rotterdam", "Copenhagen")) %>%
  mutate(number_of_uses = rowSums(across(venueType_disco:venueType_studio)))%>%
  group_by(number_of_uses, city) %>%
  tally() %>%
  group_by(city) %>%
  mutate(pct = round(100*(n/sum(n)), digits = 2)) %>%
  ggplot()+
  geom_bar(aes(y = pct, x = number_of_uses, fill = city), 
           stat = "identity", alpha = 0.6) +
  scale_fill_manual(values = c(CityPalette[8], CityPalette[5] , CityPalette[7], CityPalette[3], CityPalette[6], CityPalette[4]))+
  theme(legend.direction = "horizontal", legend.position = "bottom")+
  facet_wrap(~city)+
  labs(
    title = "Number of programming or types of use in creative spaces",
    subtitle = "",
    x="Number of Space Usages",
    y="Percentage of Venues",
    fill = "CFP City",
    caption = "Data: CFP")+
  plotTheme

main_venue_data %>%
  filter(city != "",
         city %in% c("Tokyo", "Stockholm", "Montreal", "Sydney", "Rotterdam", "Copenhagen")) %>%
  mutate(number_of_uses = rowSums(across(venueType_disco:venueType_studio)))%>%
  group_by(city) %>%
  summarize(mean_num_uses = mean(number_of_uses),
            median_num_uses = median(number_of_uses)) %>%
  group_by(city) %>%
  kable() %>%
  kable_styling()
city mean_num_uses median_num_uses
Copenhagen 2.166667 2
Montreal 1.756458 2
Rotterdam 2.380282 2
Stockholm 1.088235 1
Sydney 1.630705 2
Tokyo 1.006885 1

4 Closures

The data on closures that was part of the CFP during and right after COVID is no longer part of the analysis.

Sources: CFP

4.1 Closures by Programming

4.2 Closures Map

5 Districts

Sources: CFP, City of Copenhagen

5.0. Dynamic map of districts

Hover over the shapes to see a pop-up icon with some key information about the districts.

Fields suffixed “_t2” correspond to 2022 data, and “_t1” to 2017.

d2_aggregates %>%
  mutate(venue_kmsq = venue_count / area_km2) %>%
  select(mean_creative_output, mean_experimentation, mean_community_focus, mean_promotion, d2_id, d2_name,
         venue_count, venue_kmsq, station_count, transit_stations_per_km2, area_km2, pop_per_km2_t2, rent_t2, income_t2, change_rent) %>%
  mapview(., zcol = "venue_count")

5.1 Number of venues per district map

Venues are most concentrated in Indre By, but there are notable numbers of venues in three or four other districts.

These maps will need to be styled in Illustrator to deal with labels.

ggplot()+
  geom_sf(data = d2_aggregates, fill = 'transparent', 
          color = 'grey')+
  geom_sf(data = d2_aggregates %>%
            filter(is.na(venue_count) == FALSE),
          aes(fill = venue_count),
          color = 'grey', alpha = 0.6)+
  scale_fill_viridis('Venue Count', direction = -1, alpha = 0.8)+
  geom_text(data = d2_aggregates %>%
              filter(venue_count != 0) %>%
              mutate(lon=map_dbl(geometry, ~st_centroid(.x)[[1]]),
                     lat=map_dbl(geometry, ~st_centroid(.x)[[2]])),
            aes(x = lon, y = lat, label = d2_name),
            color = "black", size = 3)+
  labs(
    title = "Venue Count by District",
    caption = "Data: CFP, City of Copenhagen")+
  mapTheme

5.2 Number of venues per district chart

d2_aggregates %>%
            filter(venue_count != 0) %>%
  ggplot()+
  geom_bar(aes(x = reorder(toupper(d2_name), venue_count), y = venue_count), 
           stat = "identity", fill = CityPalette[8], alpha = 0.6, size = 4 )+
  labs(
    title = "Venues by District",
    #subtitle = "NUMBER districts have no reported venues",
    x="",
    y="Total Venues",
    #fill = "CFP City",
    caption = "Data: CFP, City of Copenhagen")+
  coord_flip()+
  plotTheme

#ggsave("rotterdam_images/venues_district_bar.pdf", width = 11, height = 8, units = "in")

5.3 Venue Density by District - Symbology

The central districts of Indre By and Norrebro have the highest venue density.

temp <- d2_aggregates %>%
  mutate(venue_dens = venue_count / area_km2) %>%
            filter(is.na(venue_count) == FALSE)
  

ggplot()+
  geom_sf(data = d2_aggregates, fill = 'transparent', 
          color = 'grey')+
  geom_sf(data = d2_aggregates %>%
            mutate(venue_dens = venue_count / area_km2) %>%
            filter(is.na(venue_count) == FALSE),
          aes(fill = venue_dens),
          color = 'grey')+
  scale_fill_viridis('Venue Density\n Venues/km^2', direction = -1, 
                     #limits=c(0,32), 
                     #breaks=c(0,32), 
                     alpha = 0.6,
                     #labels=c("Minimum","Maximum"),
                     breaks = c(quantile(temp$venue_dens, probs = seq(0, 1, length.out = 5))),
                     labels = scales::number_format(scale = 1))+
  geom_text(data = d2_aggregates %>%
              filter(venue_count !=0) %>%
              mutate(lon=map_dbl(geometry, ~st_centroid(.x)[[1]]),
                     lat=map_dbl(geometry, ~st_centroid(.x)[[2]])),
            aes(x = lon, y = lat, label = d2_name),
            color = "black", size = 3)+
  labs(
    title = "Venue Density by District",
    caption = "Data: CFP, City of Copenhagen")+
  mapTheme

#ggsave("rotterdam_images/venues_district_chloropleth.pdf", width = 11, height = 8, units = "in")

5.4 Venue Density - Heatmap

# Heat map

# Make a fishnet

fishnet <- st_make_grid(d2_aggregates %>%
                          st_transform(crs = 3035), 
                        cellsize = 500) %>% #size in meters
  st_sf() %>%
  st_transform(crs = 4326)

# If this fails, run the following line of code:
# sf_use_s2(FALSE)

fishnet <- fishnet[d2_aggregates %>%
                    st_transform(crs = 4326),] %>%
  mutate(uniqueID = rownames(.)) %>%
  select(uniqueID) %>%
  mutate(lon=map_dbl(geometry, ~st_centroid(.x)[[1]]),
         lat=map_dbl(geometry, ~st_centroid(.x)[[2]]))

# Join the fishnet to the points

fishnet <- st_join(fishnet, 
                main_venue_data %>%
                  filter(city == "Copenhagen",
                         is.na(uid) == FALSE,
                         is.na(x)== FALSE,
                         is.na(y)== FALSE) %>%
                  st_as_sf(coords = c("x", "y"), crs = 4326), 
                join = st_intersects, 
                left = TRUE) %>%
  select(uniqueID, uid) %>%
  group_by(uniqueID) %>% 
  summarise(n_venues = n_distinct(uid, na.rm = TRUE)) %>%
  #mutate(n = ifelse(n == 1, 0, n)) %>%
  #rename(n_venues = n) %>%
  as.data.frame() %>%
  select(-geometry) %>%
  left_join(fishnet, .)
# Make sure you set the labels here to reflect the actual max density by cell

max(fishnet$n_venues)
## [1] 6
ggplot()+
  geom_sf(data = d2_aggregates,
          fill = '#f0f0f0', 
          color = 'white')+
  geom_sf(data = fishnet %>%
            filter(n_venues >0), 
          aes(fill = n_venues),
          color = "transparent",
          alpha = 0.4)+
  #geom_text(data = d1_aggregates %>%
  #            filter(city_en == "Montreal",
  #                   venue_count !=0) %>%
  #            mutate(lon=map_dbl(geometry, ~st_centroid(.x)[[1]]),
  #                   lat=map_dbl(geometry, ~st_centroid(.x)[[2]])),
  #          aes(x=lon, y=lat, label=d1_name), size = 3, alpha = 0.4)+
  scale_fill_viridis('VENUES PER\n500 m CELL', direction = -1,
                     limits=c(1,max(fishnet$n_venues)), breaks=c(1, max(fishnet$n_venues)),
                     labels=c("1",max(fishnet$n_venues)))+
  labs(
    title = "VENUE DENSITY",
    subtitle = "",
    caption = "Data: CFP, City of Copenhagen")+
  mapTheme

#ggsave("rotterdam_images/venues_district_heatmap.pdf", width = 11, height = 8, units = "in")
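
The grid count behind the heatmap can be illustrated without spatial libraries. A minimal base-R sketch follows, assuming venue coordinates already projected to metres (as EPSG:3035 is). The coordinates and the `cell_id()` helper are hypothetical, not real venue locations.

```r
# Hypothetical projected coordinates in metres (not real venue locations)
venues <- data.frame(
  uid = 1:6,
  x = c(120, 480, 510, 990, 1020, 1490),
  y = c(100, 450, 20, 510, 980, 30)
)

cellsize <- 500  # metres, matching the fishnet cell size above

# Hypothetical helper: label of the square cell containing each point
cell_id <- function(x, y, size) {
  paste(floor(x / size), floor(y / size), sep = "_")
}

venues$cell <- cell_id(venues$x, venues$y, cellsize)

# Distinct venues per cell, analogous to n_distinct(uid) per uniqueID
n_venues <- tapply(venues$uid, venues$cell, function(u) length(unique(u)))
print(n_venues)
max(n_venues)
```

Each 500 m cell covers 0.25 km^2, so dividing a cell count by 0.25 would convert it to venues per km^2.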

5.5 Venue Density with Experimental Content

The districts immediately north and west of Indre By have the highest experimental content ratings.

ggplot()+
  geom_sf(data = d2_aggregates,
          fill = '#f0f0f0', 
          color = 'white')+
  geom_point(data = d2_aggregates %>%
               mutate(lon=map_dbl(geometry, ~st_centroid(.x)[[1]]),
                      lat=map_dbl(geometry, ~st_centroid(.x)[[2]])), 
             aes(x=lon, y=lat, size = venue_count, 
                 color = mean_experimentation), 
             alpha = 0.4) + 
  scale_color_viridis('Mean Experimental\nLikelihood Score', direction = -1,
                      limits = c(1, 4),
                      breaks = c(1,2, 3, 4),
                      labels = c("Not At All", "Not Very", "Somewhat", "Extremely"))+
  scale_size_area(name="", max_size = 25, guide = 'none') + 
  #guides(size=guide_legend("Total Venues")) +
  geom_text(data = d2_aggregates %>%
              filter(venue_count != 0) %>%
              mutate(lon=map_dbl(geometry, ~st_centroid(.x)[[1]]),
                     lat=map_dbl(geometry, ~st_centroid(.x)[[2]])),
            aes(x=lon, y=lat, label=d2_name), size = 3, alpha = 0.65)+
  labs(
    title = "Creative Footprint Venues",
    subtitle = "Icon sizes represent total number of venues per district.",
    caption = "Data: CFP, City of Copenhagen")+
  mapTheme

5.6 Districts and Experimental Content

# Top districts for experimental content

ggplot(data = d2_aggregates %>%
         filter(is.na(venue_count) == FALSE) %>%
         as.data.frame() %>%
         mutate(venue_text = str_c(as.character(venue_count), " venues")) %>%
         select(venue_text, venue_count, mean_experimentation, d2_name) %>%
         arrange(-mean_experimentation) %>%
         top_n(10, mean_experimentation))+  # rank explicitly; a bare top_n(10) ranks by the last column (d2_name)
  geom_bar(aes(x = reorder(toupper(d2_name), mean_experimentation), 
               y=mean_experimentation),
           stat = "identity", fill = CityPalette[8], width = 0.3,
           alpha = 0.5)+
  scale_y_continuous(breaks = 1:4, 
                     labels = c("Not At All", "Not Too", "Somewhat", "Very"),
                     expand = c(0, 1))+
  geom_text(aes(label = toupper(venue_text), x = toupper(d2_name), 
                y= mean_experimentation / 2), alpha = 0.6, size = 4 )+
  labs(
    title = "DISTRICTS WITH HIGHEST EXPERIMENTAL CONTENT SCORES",
    #subtitle = "Labels indicate total number of districts in that ward",
    caption = "Data: CFP, City of Copenhagen")+
  ylab("Mean Likelihood of Experimental Program")+
  xlab("")+
  coord_flip()+
  plotTheme

ggplot()+
  geom_sf(data = d2_aggregates, fill = 'transparent', 
          color = 'grey')+
  geom_sf(data = d2_aggregates %>%
            filter(is.na(venue_count) == FALSE),
          aes(fill = mean_experimentation),
          color = 'grey', alpha = 0.6)+
  scale_fill_viridis('Mean Experimental\nLikelihood Score', direction = -1,
                      limits = c(1, 4),
                      breaks = c(1,2, 3, 4),
                      labels = c("Not At All", "Not Very", "Somewhat", "Extremely"))+
  geom_text(data = d2_aggregates %>%
              filter(venue_count != 0) %>%
              mutate(lon=map_dbl(geometry, ~st_centroid(.x)[[1]]),
                     lat=map_dbl(geometry, ~st_centroid(.x)[[2]])),
            aes(x = lon, y = lat, label = d2_name),
            color = "black", size = 3)+
  labs(
    title = "Experimental Content Scores By District",
    subtitle = "",
    caption = "Data: CFP, City of Copenhagen")+
  mapTheme

#ggsave("rotterdam_images/experimental_districts_chloropleth.pdf", width = 11, height = 8, units = "in")

5.7 Venue Density with Community Focused Content

Average scores in the Centrum are lower than elsewhere, though not substantially. The clustering analysis shows some heterogeneity within the Centrum.

ggplot()+
  geom_sf(data = d2_aggregates,
          fill = '#f0f0f0', 
          color = 'white')+
  geom_point(data = d2_aggregates %>%
               mutate(lon=map_dbl(geometry, ~st_centroid(.x)[[1]]),
                      lat=map_dbl(geometry, ~st_centroid(.x)[[2]])), 
             aes(x=lon, y=lat, size = venue_count, 
                 color = mean_community_focus), 
             alpha = 0.4) + 
  scale_color_viridis('Mean Community Focus\nLikelihood Score', direction = -1,
                      limits = c(1, 4),
                      breaks = c(1,2, 3, 4),
                      labels = c("Not At All", "Not Very", "Somewhat", "Extremely"))+
  scale_size_area(name="", max_size = 25, guide = 'none') + 
  #guides(size=guide_legend("Total Venues")) +
  geom_text(data = d2_aggregates %>%
              filter(venue_count != 0) %>%
              mutate(lon=map_dbl(geometry, ~st_centroid(.x)[[1]]),
                     lat=map_dbl(geometry, ~st_centroid(.x)[[2]])),
            aes(x=lon, y=lat, label=d2_name), size = 3, alpha = 0.65)+
  labs(
    title = "Creative Footprint Venues",
    subtitle = "Icons represent total number of venues per district",
    caption = "Data: CFP, City of Copenhagen")+
  mapTheme

ggplot()+
  geom_sf(data = d2_aggregates, fill = 'transparent', 
          color = 'grey')+
  geom_sf(data = d2_aggregates %>%
            filter(is.na(venue_count) == FALSE),
          aes(fill = mean_community_focus),
          color = 'grey', alpha = 0.6)+
  scale_fill_viridis('Mean Community Focus\nLikelihood Score', direction = -1,
                      limits = c(1, 4),
                      breaks = c(1,2, 3, 4),
                      labels = c("Not At All", "Not Very", "Somewhat", "Extremely"))+
  geom_text(data = d2_aggregates %>%
              filter(venue_count != 0) %>%
              mutate(lon=map_dbl(geometry, ~st_centroid(.x)[[1]]),
                     lat=map_dbl(geometry, ~st_centroid(.x)[[2]])),
            aes(x = lon, y = lat, label = d2_name),
            color = "black", size = 3)+
    labs(
    title = "Community Focus Scores By District",
    subtitle = "",
    caption = "Data: CFP, City of Copenhagen")+
  mapTheme

#ggsave("rotterdam_images/community_districts_chloropleth.pdf", width = 11, height = 8, units = "in")
# Top districts for community content

ggplot(data = d2_aggregates %>%
         filter(is.na(mean_community_focus) == FALSE) %>%
         as.data.frame() %>%
         mutate(venue_text = str_c(as.character(venue_count), " venues")) %>%
         select(venue_text, venue_count, mean_community_focus, d2_name) %>%
         arrange(-mean_community_focus) %>%
         top_n(10, mean_community_focus))+  # rank explicitly; a bare top_n(10) ranks by the last column (d2_name)
  geom_bar(aes(x = reorder(toupper(d2_name), mean_community_focus), 
               y=mean_community_focus),
           stat = "identity", fill = CityPalette[6], width = 0.3,
           alpha = 0.8)+
  scale_y_continuous(breaks = 1:4, 
                     labels = c("Not At All", "Not Too", "Somewhat", "Very"),
                     expand = c(0, 1))+
  geom_text(aes(label = toupper(venue_text), x = toupper(d2_name), 
                y= mean_community_focus / 2), alpha = 0.6, size = 4 )+
  labs(
    title = "Districts with highest community likelihood scores",
    #subtitle = "Labels indicate total number of districts in that ward",
    caption = "Data: CFP, City of Copenhagen")+
  ylab("Mean Likelihood of Community-Oriented Program")+
  xlab("")+
  coord_flip()+
  plotTheme

5.8 Venue Density with Creative Content

There are notably high creativity scores all across the city, including in Indre By. Norrebro jumps off the chart once again.

ggplot()+
  geom_sf(data = d2_aggregates,
          fill = '#f0f0f0', 
          color = 'white')+
  geom_point(data = d2_aggregates %>%
               mutate(lon=map_dbl(geometry, ~st_centroid(.x)[[1]]),
                      lat=map_dbl(geometry, ~st_centroid(.x)[[2]])), 
             aes(x=lon, y=lat, size = venue_count, 
                 color = mean_creative_output), 
             alpha = 0.4) + 
  scale_color_viridis('Mean Creative Content\nLikelihood Score', direction = -1,
                        limits = c(1, 4),
                        breaks = c(1,2, 3, 4),
                        labels = c("Not At All", "Not Very", "Somewhat", "Extremely"))+
  scale_size_area(name="", max_size = 25, guide = 'none') + 
  #guides(size=guide_legend("Total Venues")) +
  geom_text(data = d2_aggregates %>%
              filter(venue_count != 0) %>%
              mutate(lon=map_dbl(geometry, ~st_centroid(.x)[[1]]),
                     lat=map_dbl(geometry, ~st_centroid(.x)[[2]])),
            aes(x=lon, y=lat, label=d2_name), size = 3, alpha = 0.65)+
  labs(
    title = "Creative Footprint Venues",
    subtitle = "Icons represent total number of venues per district.",
    caption = "Data: CFP, City of Copenhagen")+
  mapTheme

# Top districts for creative content

ggplot(data = d2_aggregates %>%
         filter(is.na(mean_creative_output) == FALSE) %>%
         as.data.frame() %>%
         mutate(venue_text = str_c(as.character(venue_count), " venues")) %>%
         select(venue_text, venue_count, mean_creative_output, d2_name) %>%
         arrange(-mean_creative_output) %>%
         top_n(10, mean_creative_output))+  # rank explicitly; a bare top_n(10) ranks by the last column (d2_name)
  geom_bar(aes(x = reorder(toupper(d2_name), mean_creative_output), 
               y=mean_creative_output),
           stat = "identity", fill = CityPalette[6], width = 0.3,
           alpha = 0.8)+
  scale_y_continuous(breaks = 1:4, 
                     labels = c("Not At All", "Not Too", "Somewhat", "Very"),
                     expand = c(0, 1))+
  geom_text(aes(label = toupper(venue_text), x = toupper(d2_name), 
                y= mean_creative_output / 2), alpha = 0.6, size = 4 )+
  labs(
    title = "Districts with highest creative likelihood scores",
    #subtitle = "Labels indicate total number of districts in that ward",
    caption = "Data: CFP, City of Copenhagen")+
  ylab("Mean Creative Program Likelihood Rating")+
  xlab("")+
  coord_flip()+
  plotTheme

6 Urban Variables

Sources: CFP, City of Copenhagen, Finans Danmark

6.1 Urban Variable Maps

6.1.1. Rents

Property values are higher in the central areas, with Indre By and neighboring districts reporting the highest mean prices per square meter.

ggplot()+
  geom_sf(data = d2_aggregates, fill = 'transparent', 
          color = 'grey')+
  geom_sf(data = d2_aggregates %>%
            filter(is.na(rent_t2) == FALSE),
          aes(fill = as.numeric(rent_t2)/1000),
          color = 'grey')+
  scale_fill_viridis('Thousand DKK / sqm', direction = -1, alpha = 0.8)+
  geom_text(data = d2_aggregates %>%
              mutate(lon=map_dbl(geometry, ~st_centroid(.x)[[1]]),
                     lat=map_dbl(geometry, ~st_centroid(.x)[[2]])),
            aes(x = lon, y = lat, label = d2_name),
            color = "black", size = 3)+
labs(
    title = "Mean reported property prices",
    subtitle = "Thousand DKK/sqm, for all types of dwellings, 2022.",
    caption = "Data: Finans Danmark, City of Copenhagen")+
  mapTheme

6.1.2. Change in Property Values

Property values are appreciating slightly faster in the eastern districts. This growth is not nearly as dramatic in Copenhagen as in other recent CFP cities such as Rotterdam.

Based on 2017-2022 figures, adjusted to 2022 rates using the following CPI adjustment:

https://www.global-rates.com/en/calculations/inflation-calculator/#amount

ggplot()+
  geom_sf(data = d2_aggregates, fill = 'transparent', 
          color = 'grey')+
  geom_sf(data = d2_aggregates %>%
            filter(is.na(change_rent) == FALSE),
          aes(fill = 100*(as.numeric(change_rent) / (.93*as.numeric(rent_t1)))),
          color = 'grey')+
  scale_fill_viridis('Percentage Change', direction = -1, alpha = 0.8)+
  geom_text(data = d2_aggregates %>%
              mutate(lon=map_dbl(geometry, ~st_centroid(.x)[[1]]),
                     lat=map_dbl(geometry, ~st_centroid(.x)[[2]])),
           aes(x = lon, y = lat, label = d2_name),
            color = "black", size = 3)+
labs(
    title = "Change in mean reported property prices",
    subtitle = "For all types of dwellings, 2017-2022, inflation adjusted",
    caption = "Data: Finans Danmark, City of Copenhagen")+
  mapTheme
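
As a worked illustration of an inflation-adjusted percentage change, here is a minimal sketch under stated assumptions. The prices and the cumulative inflation factor are hypothetical. The sketch restates the 2017 value in 2022 prices before differencing, which may differ from how `change_rent` was constructed upstream.

```r
# Hypothetical values, not taken from the Finans Danmark data
price_2017 <- 40000   # DKK/sqm, in 2017 prices
price_2022 <- 50000   # DKK/sqm, in 2022 prices
cpi_factor <- 1.08    # assumed cumulative inflation, 2017 to 2022

# Hypothetical helper: real percentage change between two periods
real_pct_change <- function(p_old, p_new, factor) {
  p_old_adj <- p_old * factor   # restate the old price in new-period prices
  100 * (p_new - p_old_adj) / p_old_adj
}

nominal <- 100 * (price_2022 - price_2017) / price_2017
real    <- real_pct_change(price_2017, price_2022, cpi_factor)
c(nominal = nominal, real = real)
```

With these made-up inputs the nominal change is 25 percent and the real change is smaller, since part of the nominal rise only reflects general price growth.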

6.1.3. Income

Income is notably lower in some venue-rich areas (e.g. Norrebro) and higher in others (e.g. Indre By). Venue clusters appear in districts across the income range.

ggplot()+
  geom_sf(data = d2_aggregates, fill = 'transparent', 
          color = 'grey')+
  geom_sf(data = d2_aggregates %>%
            filter(is.na(income_t2) == FALSE),
          aes(fill = as.numeric(income_t2)),
          color = 'grey')+
  scale_fill_viridis('Income', direction = -1, alpha = 0.8)+
  geom_text(data = d2_aggregates %>%
              mutate(lon=map_dbl(geometry, ~st_centroid(.x)[[1]]),
                     lat=map_dbl(geometry, ~st_centroid(.x)[[2]])),
           aes(x = lon, y = lat, label = d2_name),
            color = "black", size = 3)+
labs(
    title = "Mean Personal Income",
    subtitle = "Personal income in total (ex. imputed rent and before deductions of interest expenses), DKK, all persons 14+, 2022",
    caption = "Data: City of Copenhagen")+
  mapTheme

6.1.4. Change in Income

There is an inflation-adjusted rise in overall incomes in Copenhagen. Property values are lagging somewhat behind incomes as an indicator of change.

ggplot()+
  geom_sf(data = d2_aggregates, fill = 'transparent', 
          color = 'grey')+
  geom_sf(data = d2_aggregates %>%
            filter(is.na(change_income) == FALSE),
          aes(fill = 100*(as.numeric(change_income) / (.92*as.numeric(income_t1)))),
          color = 'grey')+
  scale_fill_viridis('Percentage', direction = -1, alpha = 0.8)+
  geom_text(data = d2_aggregates %>%
              mutate(lon=map_dbl(geometry, ~st_centroid(.x)[[1]]),
                     lat=map_dbl(geometry, ~st_centroid(.x)[[2]])),
           aes(x = lon, y = lat, label = d2_name),
            color = "black", size = 3)+
labs(
    title = "Percentage Change in Mean Income, 2017-2022",
    subtitle = "Adjusted for inflation",
    caption = "Data: City of Copenhagen")+
  mapTheme

6.2. Transit Density

There is fairly widespread fixed transit access across the city, most concentrated in areas surrounding Frederiksberg. Based on testimony in focus groups, it seems clear that multi-modal transport is both important and highly valued in Copenhagen.

Transit data includes only the locations of fixed Metro and S-train stops, not buses, trams, etc.

Source: City of Copenhagen, OpenData DK, Rezk (2020)

ggplot()+
  geom_sf(data = d2_aggregates, fill = 'transparent', 
          color = 'grey')+
  geom_sf(data = d2_aggregates %>%
            filter(is.na(transit_stations_per_km2) == FALSE),
          aes(fill = transit_stations_per_km2),
          color = 'grey')+
  scale_fill_viridis('Stations/km^2', direction = -1, alpha = 0.8)+
 geom_text(data = d2_aggregates %>%
              mutate(lon=map_dbl(geometry, ~st_centroid(.x)[[1]]),
                     lat=map_dbl(geometry, ~st_centroid(.x)[[2]])),
           aes(x = lon, y = lat, label = d2_name),
            color = "black", size = 3)+
labs(
    title = "Transit Density",
    subtitle = "Fixed Rail Stations / km^2 - Metro and S stations only",
    caption = "Data: City of Copenhagen, OpenData DK, Rezk (2020)")+
  mapTheme

6.2.1 Venue density as a function of transit density

General trends across CFP cities show a positive relationship between venue density and fixed transit density. This trend generally holds even in the auto-oriented cities Sydney and Montreal (first plot). Rotterdam has a strong correlation, as the Centrum has by far the highest transit density and the highest venue density in the city (third plot).

ggplot(d1_aggregates %>%
         as.data.frame() %>%
         filter(city_en %in% c("Tokyo", "Berlin", "Montreal", "Stockholm")) %>%
         filter(! d1_name %in% remove_arrondissements) %>%
         mutate(venue_dens = venue_count / area_km2,
                venue_dens = ifelse(venue_dens == 0, 0, venue_dens),
                city  = case_when(city_en == "Berlin" ~ "1. BERLIN, 2017",
                                  city_en == "Tokyo" ~ "3. TOKYO, 2019",
                                  city_en == "Stockholm" ~ "4. STOCKHOLM, 2021",
                                  city_en == "Montreal" ~ "5. MONTREAL, 2022")) %>%
         select(city, venue_dens, transit_stations_per_km2) %>%
         rbind(., d2_aggregates_all %>%
                 as.data.frame() %>%
         filter(city_en %in% c("Sydney", "New York", "Rotterdam")) %>%
         mutate(venue_dens = venue_count / area_km2,
                venue_dens = ifelse(venue_dens == 0, 0, venue_dens),
                city  = case_when(city_en == "New York" ~ "2. NEW YORK, 2018",
                                  city_en == "Sydney" ~ "6. SYDNEY, 2023",
                                  city_en == "Rotterdam" ~ "7. ROTTERDAM, 2024")) %>%
           select(city, venue_dens, transit_stations_per_km2)) %>%
           rbind(., d2_aggregates %>%
                   as.data.frame() %>%
         mutate(venue_dens = venue_count / area_km2,
                venue_dens = ifelse(venue_dens == 0, 0, venue_dens),
                city  = "8. COPENHAGEN, 2024") %>%
         select(city, venue_dens, transit_stations_per_km2))) + 
  geom_point(aes(x = transit_stations_per_km2, y = venue_dens, color = city),
             size = 1, alpha = 0.6)+
  geom_hline(aes(yintercept = median(d1_aggregates$venue_count / d1_aggregates$area_km2, na.rm = TRUE)))+
  geom_vline(aes(xintercept = median(d1_aggregates$transit_stations_per_km2, na.rm = TRUE)))+
  scale_color_manual(values = c(CityPalette[1], CityPalette[2], CityPalette[3], CityPalette[4], CityPalette[5], CityPalette[6], CityPalette[7], CityPalette[8]))+
 # scale_x_continuous(breaks = c(0, median(d1_aggregates$transit_stations_per_km2, na.rm = TRUE),
#                                2.5, max(d1_aggregates$transit_stations_per_km2, na.rm = TRUE)), 
#                     labels = c("NONE", "MEDIAN", "HIGH", "MAXIMUM"))+
 # scale_y_continuous(breaks = c(0, median(d1_aggregates$venue_count / d1_aggregates$area_km, na.rm = TRUE),
  #                              5.5, max(d1_aggregates$venue_count / d1_aggregates$area_km, na.rm = TRUE)), 
  #                   labels = c("NONE", "MEDIAN", "HIGH", "MAXIMUM"))+
  geom_smooth(aes(x = transit_stations_per_km2, y = venue_dens),
                 alpha=0.1,
                 weight = 0.5, color = "light grey", linetype="dashed",
                 se = FALSE,
                 method = "lm") +
  labs(
    title = "VENUE DENSITY AND TRANSIT DENSITY GO TOGETHER",
    subtitle = "EACH POINT REPRESENTS NTA (NY), WARD (TOKYO), DISTRICT (SWE), \nARRONDISSEMENT (MON), SA2 (SYDNEY), GEBIEDEN (ROT), DISTRICT (COP), OR BEZIRK (BER)\nDOTTED LINE REPRESENTS BEST FIT TREND.",
    color = "CFP CITY",
    caption = "Data: CFP, Centraal Bureau voor de Statistiek, Montreal Open Data, estat.go.jp, US Census Bureau, City of New York, \nGeodaten aus Deutschland, daten.berlin.de, statistikdatabasen.scb.se\nStatistics Canada, New South Wales Open Data, Australian Bureau of Statistics, City of Copenhagen, Rezk (2020)")+
  ylab("VENUE DENSITY")+
  xlab("RAIL DENSITY")+
  plotTheme

d2_aggregates %>%
  mutate(has_venues = ifelse(is.na(venue_count) == TRUE, "No Venues", "Has Venues")) %>%
  as.data.frame() %>%
ggplot()+
  geom_histogram(aes(transit_stations_per_km2), 
                 binwidth = .5, alpha = 0.6, fill = CityPalette[6])+
  facet_wrap(~has_venues)+
  labs(
    title = "Fixed Transit Station Density by District",
    subtitle = "Areas with venues don't necessarily have access to transit",
    caption = "Data: City of Copenhagen, OpenData DK, Resk (2020)")+
  ylab("n districts")+
  xlab("Fixed Rail Transit Stops per km2 by District")+
  plotTheme
d2_aggregates %>%
  mutate(venue_dens = venue_count / area_km2) %>%
  ggplot()+
  geom_point(aes(x = transit_stations_per_km2, y = venue_dens))+
  geom_text(aes(x = transit_stations_per_km2, y = venue_dens, label = d2_name), 
            nudge_x = .1, nudge_y = 0.1, alpha = 0.2, size = 3) +
  geom_smooth(aes(x = transit_stations_per_km2, y = venue_dens),
              alpha=0.1,
              weight = 0.5, color = "light grey", linetype="dashed",
              se = FALSE,
              method = "lm")+
  labs(
    title = "Venue Density by Fixed Transit Station Density by District",
    subtitle = "",
    caption = "Data: City of Copenhagen, OpenData DK, Resk (2020)")+
  ylab("Venue Density")+
  xlab("Rail Density")+
  plotTheme

6.2.2 Relationships Between Transit and Rents

d2_aggregates %>%
  ggplot()+
  # Default point shapes ignore `fill`, so the palette color is set with `color`
  geom_point(aes(x = transit_stations_per_km2, y = as.numeric(rent_t2)/1000),
             size = 2, alpha = 0.6, color = CityPalette[8])+
  geom_smooth(aes(x = transit_stations_per_km2, y = as.numeric(rent_t2)/1000),
              alpha=0.1,
              weight = 0.5, color = "light grey", linetype="dashed",
              se = FALSE,
              method = "lm")+
  geom_text(aes(x = transit_stations_per_km2, y = as.numeric(rent_t2)/1000, label = d2_name),  
            nudge_x = .1, nudge_y = 0.1, alpha = 0.2, size = 3) +
  xlim(0,1.5)+
  ylim(0,70)+
  labs(
    title = "Relationship Between Housing Prices and Transit Density",
    subtitle = "Each point represents one of the districts in the study area,\n Values in mean listed value, thousand DKK/sqm, 2022 for all structure types",
    caption = "Data: City of Copenhagen, Finans Danmark, OpenData DK, Resk (2020)")+
  ylab("Mean District Sales Price (thousand DKK/sqm)")+
  xlab("Fixed Transit Stations / km2")+
  plotTheme

6.3 Urban Variables and Venue Density

In this section we present correlations between urban variables and venue density.

The tables report Pearson’s r, a measure of linear correlation between two variables. A value of 1 is a perfect positive correlation, -1 is a perfect negative correlation, and 0 is no linear correlation.
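
For reference, Pearson’s r is the standard sample correlation coefficient, which is what R's `cor(..., method = "pearson")` computes. As a sketch of the definition, with $x_i$ standing for an urban variable and $y_i$ for venue density or a mean rating in district $i$, across $n$ districts (the symbols are notation introduced here, not taken from the analysis code):

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

Because each table is computed over a small number of districts, a single outlying district can move r substantially.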

d2_aggregates %>%
  mutate(venue_dens = venue_count / area_km2,
         mean_ped_freq = ((3*ped_freq_3) + (2* ped_freq_2) + (1* ped_freq_1))/venue_count,
         pop_youth_pct_t2 = 100 * pop_youth_pct_t2,
         rent_t2 = as.numeric(rent_t2),
         income_t2 = as.numeric(income_t2)) %>%
  as.data.frame() %>%
  select(venue_dens, transit_stations_per_km2, pop_per_km2_t2, pop_youth_pct_t2, mean_ped_freq, income_t2, rent_t2) %>%
  gather(-venue_dens, key = "variable", value = "value") %>%
  mutate(variable = case_when(variable == "transit_stations_per_km2" ~ "Transit Stations / km2",
                              variable == "pop_per_km2_t2" ~ "Population / km2",
                              variable == "mean_ped_freq" ~ "Mean Pedestrian Frequency (Max - 3)",
                              variable == "pop_youth_pct_t2" ~ "Percentage of Young Adults",
                              variable == "income_t2" ~ "Median Weekly Household Income",
                              variable == "rent_t2" ~ "Median Weekly Rent")) %>%
  ggplot()+
  geom_point(aes(y = venue_dens, x = value), color = CityPalette[6], alpha = 0.6)+
    facet_wrap(~variable, scales = "free")+
  geom_line(stat = "smooth", method='lm', 
            aes(y = venue_dens, x = value), 
            se = FALSE, 
            linetype = "dashed", alpha = 0.5)+
  labs(
    title = "Venue Density as a Function of Urban Variables",
    subtitle = "Each point represents one of the districts with venues in the survey",
    caption = "Data: CFP, City of Copenhagen, OpenData DK, Finans Danmark, Resk (2020)")+
  ylab("Venues / km2")+
  xlab("")+
    plotTheme

d2_aggregates %>%
    mutate(venue_dens = venue_count / area_km2,
           mean_ped_freq = ((3*ped_freq_3) + (2* ped_freq_2) + (1* ped_freq_1))/venue_count,
           pop_youth_pct_t2 = 100 * pop_youth_pct_t2,
           rent_t2 = as.numeric(rent_t2),
           income_t2 = as.numeric(income_t2)) %>%
    as.data.frame() %>%
    select(venue_dens, transit_stations_per_km2, pop_per_km2_t2, pop_youth_pct_t2, mean_ped_freq, income_t2, rent_t2) %>%
    gather(-venue_dens, key = "variable", value = "value") %>%
    mutate(variable = case_when(variable == "transit_stations_per_km2" ~ "Transit Stations / km2",
                                variable == "pop_per_km2_t2" ~ "Population / km2",
                                variable == "mean_ped_freq" ~ "Mean Pedestrian Frequency (Max - 3)",
                                variable == "pop_youth_pct_t2" ~ "Percentage of Young Adults",
                                variable == "income_t2" ~ "Median Weekly Household Income",
                                variable == "rent_t2" ~ "Median Weekly Rent")) %>% 
  filter(is.na(venue_dens) == FALSE) %>% 
  group_by(variable) %>% 
  summarize(pearsons_r = cor(as.numeric(value), venue_dens, method = "pearson")) %>%
  kable() %>%
  kable_styling()

| variable | pearsons_r |
|---|---|
| Mean Pedestrian Frequency (Max - 3) | 0.2461591 |
| Median Weekly Household Income | 0.3334928 |
| Median Weekly Rent | 0.7294358 |
| Percentage of Young Adults | 0.5152338 |
| Population / km2 | 0.6312582 |
| Transit Stations / km2 | 0.4714336 |

Copenhagen has a similar economic geography to other monocentric European cities, where transit density and income are positively correlated.

Some data are missing for Berlin and Tokyo…

It’s worth noting that Montreal and Sydney have an auto-oriented urban form, whereas the other cities in the CFP sample have transit-oriented form and land markets.

d1_aggregates %>% 
  filter(city %in% c("Montreal", "Stockholm")) %>%
  # Districts with no recorded stations are treated as zero transit density
  mutate(transit_stations_per_km2 = ifelse(is.na(transit_stations_per_km2), 0, transit_stations_per_km2)) %>% 
  as.data.frame() %>%
  select(income_t2, transit_stations_per_km2, city) %>%
  rename(city_en = city) %>%
  rbind(.,
        d2_aggregates_all %>% 
          filter(city_en %in% c("Sydney", "New York", "Rotterdam")) %>%
          mutate(transit_stations_per_km2 = ifelse(is.na(transit_stations_per_km2), 0, transit_stations_per_km2)) %>%
          as.data.frame() %>%
          select(income_t2, transit_stations_per_km2, city_en)) %>%
  rbind(.,
        d2_aggregates %>%
          mutate(transit_stations_per_km2 = ifelse(is.na(transit_stations_per_km2), 0, transit_stations_per_km2)) %>%
          as.data.frame() %>%
          select(income_t2, transit_stations_per_km2, city_en)) %>%
  ggplot()+ 
  geom_point(aes(y = as.numeric(income_t2), x = transit_stations_per_km2),
             alpha = 0.6, color = CityPalette[6]) + 
  geom_line(stat = "smooth", method='lm', 
            aes(y = as.numeric(income_t2), x = transit_stations_per_km2), 
            se = FALSE, 
            linetype = "dashed", alpha = 0.5)+
  facet_wrap(~city_en, scales = "free")+
  labs(
    title = "Area Income as a Function of Transit Density",
    subtitle = "Each point represents a top level administrative district",
    caption = "Data: CFP, US Census Bureau, City of New York, \nGeodaten aus Deutschland, daten.berlin.de, statistikdatabasen.scb.se\nOpen Data Montreal, Statistics Canada, New South Wales Open Data, Australian Bureau of Statistics, Centraal Bureau voor de Statistiek, City of Copenhagen, Resk (2020), Finans Danmark, OpenData DK")+
  ylab("Average District Income in Local Currency\nTime Periods Vary")+
  xlab("Fixed Transit Stops per km2 by District")+
  plotTheme

6.4 Urban variables and venue characteristics

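Copenhagen’s venues tend to cluster near transit, and so do higher rents and incomes. Meanwhile, program ratings are negatively correlated with rents and incomes. Creative output and promotion ratings are also strongly negatively correlated with transit density, which suggests lower ratings on those measures nearer to the center. Experimentation and community focus show no correlation with transit density.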

6.4.1 Urban variables and Experimentation

Experimentation is weakly inversely related to property values (r = -0.23) and incomes (r = -0.32): where rents are higher, experimentation scores tend to be lower.

d2_aggregates %>%
  mutate(venue_dens = venue_count / area_km2,
         mean_ped_freq = ((3*ped_freq_3) + (2* ped_freq_2) + (1* ped_freq_1))/venue_count,
         pop_youth_pct_t2 = 100 * pop_youth_pct_t2) %>%
  as.data.frame() %>%
  select(mean_experimentation, transit_stations_per_km2, pop_per_km2_t2, pop_youth_pct_t2, mean_ped_freq, income_t2, rent_t2) %>%
  gather(-mean_experimentation, key = "variable", value = "value") %>%
  mutate(variable = case_when(variable == "transit_stations_per_km2" ~ "Transit Stations / km2",
                              variable == "pop_per_km2_t2" ~ "Population / km2",
                              variable == "mean_ped_freq" ~ "Mean Pedestrian Frequency (Max - 3)",
                              variable == "pop_youth_pct_t2" ~ "Percentage of Young Adults",
                              variable == "income_t2" ~ "Median Weekly Household Income",
                              variable == "rent_t2" ~ "Median Weekly Rent")) %>%
  ggplot()+
  geom_point(aes(y = mean_experimentation, x = as.numeric(value)), color = CityPalette[6], alpha = 0.6)+
    facet_wrap(~variable, scales = "free")+
  geom_line(stat = "smooth", method='lm', 
            aes(y = mean_experimentation, x = as.numeric(value)), 
            se = FALSE, 
            linetype = "dashed", alpha = 0.5)+
  labs(
    title = "Mean Experimentation Score as a Function of Urban Variables",
    subtitle = "Each point represents a district with venues in the survey",
    caption = "Data: City of Copenhagen, Resk (2020), Finans Danmark, OpenData DK")+
  ylab("Experimentation Score")+
  xlab("")+
    plotTheme

d2_aggregates %>%
  mutate(venue_dens = venue_count / area_km2,
         mean_ped_freq = ((3*ped_freq_3) + (2* ped_freq_2) + (1* ped_freq_1))/venue_count,
         pop_youth_pct_t2 = 100 * pop_youth_pct_t2) %>%
  as.data.frame() %>%
  select(mean_experimentation, transit_stations_per_km2, pop_per_km2_t2, pop_youth_pct_t2, mean_ped_freq, income_t2, rent_t2) %>%
  gather(-mean_experimentation, key = "variable", value = "value") %>%
  mutate(variable = case_when(variable == "transit_stations_per_km2" ~ "Transit Stations / km2",
                              variable == "pop_per_km2_t2" ~ "Population / km2",
                              variable == "mean_ped_freq" ~ "Mean Pedestrian Frequency (Max - 3)",
                              variable == "pop_youth_pct_t2" ~ "Percentage of Young Adults",
                              variable == "income_t2" ~ "Median Weekly Household Income",
                              variable == "rent_t2" ~ "Median Weekly Rent")) %>% 
  filter(is.na(mean_experimentation) == FALSE) %>% 
  group_by(variable) %>% 
  summarize(pearsons_r = cor(as.numeric(value), mean_experimentation, method = "pearson")) %>%
  kable() %>%
  kable_styling()

| variable | pearsons_r |
|---|---|
| Mean Pedestrian Frequency (Max - 3) | 0.3231595 |
| Median Weekly Household Income | -0.3171473 |
| Median Weekly Rent | -0.2304216 |
| Percentage of Young Adults | -0.3038922 |
| Population / km2 | 0.4557105 |
| Transit Stations / km2 | 0.0838026 |

6.4.2 Urban variables and Creative Output

Creative output is negatively correlated with income (r = -0.44) and property values (r = -0.43), and strongly negatively correlated with transit density (r = -0.74).

d2_aggregates %>%
  mutate(venue_dens = venue_count / area_km2,
         mean_ped_freq = ((3*ped_freq_3) + (2* ped_freq_2) + (1* ped_freq_1))/venue_count,
         pop_youth_pct_t2 = 100 * pop_youth_pct_t2) %>%
  as.data.frame() %>%
  select(mean_creative_output, transit_stations_per_km2, pop_per_km2_t2, pop_youth_pct_t2, mean_ped_freq, income_t2, rent_t2) %>%
  gather(-mean_creative_output, key = "variable", value = "value") %>%
  mutate(variable = case_when(variable == "transit_stations_per_km2" ~ "Transit Stations / km2",
                              variable == "pop_per_km2_t2" ~ "Population / km2",
                              variable == "mean_ped_freq" ~ "Mean Pedestrian Frequency (Max - 3)",
                              variable == "pop_youth_pct_t2" ~ "Percentage of Young Adults",
                              variable == "income_t2" ~ "Median Weekly Household Income",
                              variable == "rent_t2" ~ "Median Weekly Rent")) %>%
  ggplot()+
  geom_point(aes(y = mean_creative_output, x = as.numeric(value)), color = CityPalette[6], alpha = 0.6)+
    facet_wrap(~variable, scales = "free")+
  geom_line(stat = "smooth", method='lm', 
            aes(y = mean_creative_output, x = as.numeric(value)), 
            se = FALSE, 
            linetype = "dashed", alpha = 0.5)+
  labs(
    title = "Mean Creative Output Score as a Function of Urban Variables",
    subtitle = "Each point represents a district with venues in the survey",
    caption = "Data: City of Copenhagen, Resk (2020), Finans Danmark, OpenData DK")+
  ylab("Creative Output Score")+
  xlab("")+
    plotTheme

d2_aggregates %>%
  mutate(venue_dens = venue_count / area_km2,
         mean_ped_freq = ((3*ped_freq_3) + (2* ped_freq_2) + (1* ped_freq_1))/venue_count,
         pop_youth_pct_t2 = 100 * pop_youth_pct_t2) %>%
  as.data.frame() %>%
  select(mean_creative_output, transit_stations_per_km2, pop_per_km2_t2, pop_youth_pct_t2, mean_ped_freq, income_t2, rent_t2) %>%
  gather(-mean_creative_output, key = "variable", value = "value") %>%
  mutate(variable = case_when(variable == "transit_stations_per_km2" ~ "Transit Stations / km2",
                              variable == "pop_per_km2_t2" ~ "Population / km2",
                              variable == "mean_ped_freq" ~ "Mean Pedestrian Frequency (Max - 3)",
                              variable == "pop_youth_pct_t2" ~ "Percentage of Young Adults",
                              variable == "income_t2" ~ "Median Weekly Household Income",
                              variable == "rent_t2" ~ "Median Weekly Rent")) %>% 
  filter(is.na(mean_creative_output) == FALSE) %>% 
  group_by(variable) %>% 
  summarize(pearsons_r = cor(as.numeric(value), mean_creative_output, method = "pearson")) %>%
  kable() %>%
  kable_styling()

| variable | pearsons_r |
|---|---|
| Mean Pedestrian Frequency (Max - 3) | 0.5050916 |
| Median Weekly Household Income | -0.4431312 |
| Median Weekly Rent | -0.4339722 |
| Percentage of Young Adults | -0.2084346 |
| Population / km2 | 0.0515920 |
| Transit Stations / km2 | -0.7357791 |

6.4.3 Urban variables and Promotion

Again, promotion is negatively correlated with rents (r = -0.50) and income (r = -0.34).

d2_aggregates %>%
  mutate(venue_dens = venue_count / area_km2,
         mean_ped_freq = ((3*ped_freq_3) + (2* ped_freq_2) + (1* ped_freq_1))/venue_count,
         pop_youth_pct_t2 = 100 * pop_youth_pct_t2) %>%
  as.data.frame() %>%
  select(mean_promotion, transit_stations_per_km2, pop_per_km2_t2, pop_youth_pct_t2, mean_ped_freq, income_t2, rent_t2) %>%
  gather(-mean_promotion, key = "variable", value = "value") %>%
  mutate(variable = case_when(variable == "transit_stations_per_km2" ~ "Transit Stations / km2",
                              variable == "pop_per_km2_t2" ~ "Population / km2",
                              variable == "mean_ped_freq" ~ "Mean Pedestrian Frequency (Max - 3)",
                              variable == "pop_youth_pct_t2" ~ "Percentage of Young Adults",
                              variable == "income_t2" ~ "Median Weekly Household Income",
                              variable == "rent_t2" ~ "Median Weekly Rent")) %>%
  ggplot()+
  geom_point(aes(y = mean_promotion, x = as.numeric(value)), color = CityPalette[6], alpha = 0.6)+
    facet_wrap(~variable, scales = "free")+
  geom_line(stat = "smooth", method='lm', 
            aes(y = mean_promotion, x = as.numeric(value)), 
            se = FALSE, 
            linetype = "dashed", alpha = 0.5)+
  labs(
    title = "Mean Promotion Score as a Function of Urban Variables",
    subtitle = "Each point represents a district with venues in the survey",
    caption = "Data: City of Copenhagen, Resk (2020), Finans Danmark, OpenData DK")+
  ylab("Mean Promotion Score")+
  xlab("")+
    plotTheme

d2_aggregates %>%
  mutate(venue_dens = venue_count / area_km2,
         mean_ped_freq = ((3*ped_freq_3) + (2* ped_freq_2) + (1* ped_freq_1))/venue_count,
         pop_youth_pct_t2 = 100 * pop_youth_pct_t2) %>%
  as.data.frame() %>%
  select(mean_promotion, transit_stations_per_km2, pop_per_km2_t2, pop_youth_pct_t2, mean_ped_freq, income_t2, rent_t2) %>%
  gather(-mean_promotion, key = "variable", value = "value") %>%
  mutate(variable = case_when(variable == "transit_stations_per_km2" ~ "Transit Stations / km2",
                              variable == "pop_per_km2_t2" ~ "Population / km2",
                              variable == "mean_ped_freq" ~ "Mean Pedestrian Frequency (Max - 3)",
                              variable == "pop_youth_pct_t2" ~ "Percentage of Young Adults",
                              variable == "income_t2" ~ "Median Weekly Household Income",
                              variable == "rent_t2" ~ "Median Weekly Rent")) %>% 
  filter(is.na(mean_promotion) == FALSE) %>% 
  group_by(variable) %>% 
  summarize(pearsons_r = cor(as.numeric(value), mean_promotion, method = "pearson")) %>%
  kable() %>%
  kable_styling()

| variable | pearsons_r |
|---|---|
| Mean Pedestrian Frequency (Max - 3) | 0.4002502 |
| Median Weekly Household Income | -0.3360219 |
| Median Weekly Rent | -0.4988834 |
| Percentage of Young Adults | -0.4492016 |
| Population / km2 | -0.1317831 |
| Transit Stations / km2 | -0.7351295 |

6.4.4 Urban variables and Community Focus

Community focus is negatively correlated with rents and income.

d2_aggregates %>%
  mutate(venue_dens = venue_count / area_km2,
         mean_ped_freq = ((3*ped_freq_3) + (2* ped_freq_2) + (1* ped_freq_1))/venue_count,
         pop_youth_pct_t2 = 100 * pop_youth_pct_t2) %>%
  as.data.frame() %>%
  select(mean_community_focus, transit_stations_per_km2, pop_per_km2_t2, pop_youth_pct_t2, mean_ped_freq, income_t2, rent_t2) %>%
  gather(-mean_community_focus, key = "variable", value = "value") %>%
  mutate(variable = case_when(variable == "transit_stations_per_km2" ~ "Transit Stations / km2",
                              variable == "pop_per_km2_t2" ~ "Population / km2",
                              variable == "mean_ped_freq" ~ "Mean Pedestrian Frequency (Max - 3)",
                              variable == "pop_youth_pct_t2" ~ "Percentage of Young Adults",
                              variable == "income_t2" ~ "Median Weekly Household Income",
                              variable == "rent_t2" ~ "Median Weekly Rent")) %>%
  ggplot()+
  geom_point(aes(y = mean_community_focus, x = as.numeric(value)), color = CityPalette[6], alpha = 0.6)+
    facet_wrap(~variable, scales = "free")+
  geom_line(stat = "smooth", method='lm', 
            aes(y = mean_community_focus, x = as.numeric(value)), 
            se = FALSE, 
            linetype = "dashed", alpha = 0.5)+
  labs(
    title = "Mean Community Focus Score as a Function of Urban Variables",
    subtitle = "Each point represents a district with venues in the survey",
    caption = "Data: City of Copenhagen, Resk (2020), Finans Danmark, OpenData DK")+
  ylab("Community Focus Score")+
  xlab("")+
    plotTheme

d2_aggregates %>%
  mutate(venue_dens = venue_count / area_km2,
         mean_ped_freq = ((3*ped_freq_3) + (2* ped_freq_2) + (1* ped_freq_1))/venue_count,
         pop_youth_pct_t2 = 100 * pop_youth_pct_t2) %>%
  as.data.frame() %>%
  select(mean_community_focus, transit_stations_per_km2, pop_per_km2_t2, pop_youth_pct_t2, mean_ped_freq, income_t2, rent_t2) %>%
  gather(-mean_community_focus, key = "variable", value = "value") %>%
  mutate(variable = case_when(variable == "transit_stations_per_km2" ~ "Transit Stations / km2",
                              variable == "pop_per_km2_t2" ~ "Population / km2",
                              variable == "mean_ped_freq" ~ "Mean Pedestrian Frequency (Max - 3)",
                              variable == "pop_youth_pct_t2" ~ "Percentage of Young Adults",
                              variable == "income_t2" ~ "Median Weekly Household Income",
                              variable == "rent_t2" ~ "Median Weekly Rent")) %>%
  filter(is.na(mean_community_focus) == FALSE) %>% 
  group_by(variable) %>% 
  summarize(pearsons_r = cor(as.numeric(value), mean_community_focus, method = "pearson")) %>%
  kable() %>%
  kable_styling()

| variable | pearsons_r |
|---|---|
| Mean Pedestrian Frequency (Max - 3) | 0.4364714 |
| Median Weekly Household Income | -0.4954926 |
| Median Weekly Rent | -0.3362898 |
| Percentage of Young Adults | 0.0725070 |
| Population / km2 | 0.3825478 |
| Transit Stations / km2 | -0.0087043 |

6.4.5 All Variables Pearson Table

What is the relationship between the following urban factors and the programming or venue density characteristics of Districts?

Correlations are labeled by the absolute value of Pearson’s r: 0.6 or over is strong, 0.2 up to 0.6 is correlated, and under 0.2 is no correlation.
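
The labeling rule can be sketched as a small base R function. This is an illustration only: `classify_r` is a hypothetical helper name, not part of the analysis code, and the cutoffs mirror the thresholds stated above.

```r
# Hypothetical helper (not in the original analysis): label a Pearson's r
# value using the thresholds described in this section.
classify_r <- function(r) {
  ifelse(r >= 0.6, "Positive (Strong)",
  ifelse(r >= 0.2, "Positive",
  ifelse(r > -0.2, "No Correlation",
  ifelse(r > -0.6, "Negative", "Negative (Strong)"))))
}

# Example values taken from the correlation tables in sections 6.3 and 6.4
labels <- classify_r(c(0.7294358, 0.4714336, 0.0838026, -0.4339722, -0.7357791))
print(labels)
```

Writing the rule once as a function would let each of the five summaries reuse it rather than repeating the same `case_when()` block.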

Sources: City of Copenhagen, Resk (2020), Finans Danmark, OpenData DK

d2_aggregates %>%
    mutate(venue_dens = venue_count / area_km2,
           mean_ped_freq = ((3*ped_freq_3) + (2* ped_freq_2) + (1* ped_freq_1))/venue_count,
           pop_youth_pct_t2 = 100 * pop_youth_pct_t2,
           rent_t2 = as.numeric(rent_t2),
           income_t2 = as.numeric(income_t2)) %>%
    as.data.frame() %>%
    select(venue_dens, transit_stations_per_km2, pop_per_km2_t2, pop_youth_pct_t2, mean_ped_freq, income_t2, rent_t2) %>%
    gather(-venue_dens, key = "variable", value = "value") %>%
    mutate(variable = case_when(variable == "transit_stations_per_km2" ~ "Transit Stations / km2",
                                variable == "pop_per_km2_t2" ~ "Population / km2",
                                variable == "mean_ped_freq" ~ "Mean Pedestrian Frequency (Max - 3)",
                                variable == "pop_youth_pct_t2" ~ "Percentage of Young Adults",
                                variable == "income_t2" ~ "Income",
                                variable == "rent_t2" ~ "Rent")) %>% 
  filter(is.na(venue_dens) == FALSE) %>% 
  group_by(variable) %>% 
  summarize(pearsons_r = cor(as.numeric(value), venue_dens, method = "pearson")) %>%
  mutate(pearsons_r = case_when(pearsons_r >= 0.6 ~ "Positive (Strong)",
                                pearsons_r < 0.6 & pearsons_r >= 0.2 ~ "Positive",
                                pearsons_r < 0.2 & pearsons_r > -0.2 ~ "No Correlation",
                                pearsons_r <= -0.6 ~ "Negative (Strong)",
                                pearsons_r > -0.6 & pearsons_r <= -0.2 ~ "Negative")) %>%
  spread(key = variable, value = pearsons_r) %>% 
  mutate(Variable = "Venue Density")%>%
  rbind(., d2_aggregates %>%
  mutate(venue_dens = venue_count / area_km2,
         mean_ped_freq = ((3*ped_freq_3) + (2* ped_freq_2) + (1* ped_freq_1))/venue_count,
         pop_youth_pct_t2 = 100 * pop_youth_pct_t2) %>%
  as.data.frame() %>%
  select(mean_experimentation, transit_stations_per_km2, pop_per_km2_t2, pop_youth_pct_t2, mean_ped_freq, income_t2, rent_t2) %>%
  gather(-mean_experimentation, key = "variable", value = "value") %>%
  mutate(variable = case_when(variable == "transit_stations_per_km2" ~ "Transit Stations / km2",
                              variable == "pop_per_km2_t2" ~ "Population / km2",
                              variable == "mean_ped_freq" ~ "Mean Pedestrian Frequency (Max - 3)",
                              variable == "pop_youth_pct_t2" ~ "Percentage of Young Adults",
                              variable == "income_t2" ~ "Income",
                              variable == "rent_t2" ~ "Rent")) %>% 
  filter(is.na(mean_experimentation) == FALSE) %>% 
  group_by(variable) %>% 
  summarize(pearsons_r = cor(as.numeric(value), mean_experimentation, method = "pearson")) %>%
     mutate(pearsons_r = case_when(pearsons_r >= 0.6 ~ "Positive (Strong)",
                                pearsons_r < 0.6 & pearsons_r >= 0.2 ~ "Positive",
                                pearsons_r < 0.2 & pearsons_r > -0.2 ~ "No Correlation",
                                pearsons_r <= -0.6 ~ "Negative (Strong)",
                                pearsons_r > -0.6 & pearsons_r <= -0.2 ~ "Negative")) %>%
  spread(key = variable, value = pearsons_r) %>% 
  mutate(Variable = "Experimentation Likelihood")) %>%
  rbind(., d2_aggregates %>%
  mutate(venue_dens = venue_count / area_km2,
         mean_ped_freq = ((3*ped_freq_3) + (2* ped_freq_2) + (1* ped_freq_1))/venue_count,
         pop_youth_pct_t2 = 100 * pop_youth_pct_t2) %>%
  as.data.frame() %>%
  select(mean_creative_output, transit_stations_per_km2, pop_per_km2_t2, pop_youth_pct_t2, mean_ped_freq, income_t2, rent_t2) %>%
  gather(-mean_creative_output, key = "variable", value = "value") %>%
  mutate(variable = case_when(variable == "transit_stations_per_km2" ~ "Transit Stations / km2",
                              variable == "pop_per_km2_t2" ~ "Population / km2",
                              variable == "mean_ped_freq" ~ "Mean Pedestrian Frequency (Max - 3)",
                              variable == "pop_youth_pct_t2" ~ "Percentage of Young Adults",
                              variable == "income_t2" ~ "Income",
                              variable == "rent_t2" ~ "Rent")) %>% 
  filter(is.na(mean_creative_output) == FALSE) %>% 
  group_by(variable) %>% 
  summarize(pearsons_r = cor(as.numeric(value), mean_creative_output, method = "pearson")) %>%
    mutate(pearsons_r = case_when(pearsons_r >= 0.6 ~ "Positive (Strong)",
                                pearsons_r < 0.6 & pearsons_r >= 0.2 ~ "Positive",
                                pearsons_r < 0.2 & pearsons_r > -0.2 ~ "No Correlation",
                                pearsons_r <= -0.6 ~ "Negative (Strong)",
                                pearsons_r > -0.6 & pearsons_r <= -0.2 ~ "Negative")) %>%
  spread(key = variable, value = pearsons_r) %>% 
  mutate(Variable = "Creative Output Likelihood")) %>%
  rbind(., d2_aggregates %>%
  mutate(venue_dens = venue_count / area_km2,
         mean_ped_freq = ((3*ped_freq_3) + (2* ped_freq_2) + (1* ped_freq_1))/venue_count,
         pop_youth_pct_t2 = 100 * pop_youth_pct_t2) %>%
  as.data.frame() %>%
  select(mean_promotion, transit_stations_per_km2, pop_per_km2_t2, pop_youth_pct_t2, mean_ped_freq, income_t2, rent_t2) %>%
  gather(-mean_promotion, key = "variable", value = "value") %>%
  mutate(variable = case_when(variable == "transit_stations_per_km2" ~ "Transit Stations / km2",
                              variable == "pop_per_km2_t2" ~ "Population / km2",
                              variable == "mean_ped_freq" ~ "Mean Pedestrian Frequency (Max - 3)",
                              variable == "pop_youth_pct_t2" ~ "Percentage of Young Adults",
                              variable == "income_t2" ~ "Income",
                              variable == "rent_t2" ~ "Rent")) %>% 
  filter(is.na(mean_promotion) == FALSE) %>% 
  group_by(variable) %>% 
  summarize(pearsons_r = cor(as.numeric(value), mean_promotion, method = "pearson")) %>%
   mutate(pearsons_r = case_when(pearsons_r >= 0.6 ~ "Positive (Strong)",
                                pearsons_r < 0.6 & pearsons_r >= 0.2 ~ "Positive",
                                pearsons_r < 0.2 & pearsons_r > -0.2 ~ "No Correlation",
                                pearsons_r <= -0.6 ~ "Negative (Strong)",
                                pearsons_r > -0.6 & pearsons_r <= -0.2 ~ "Negative")) %>%
  spread(key = variable, value = pearsons_r) %>% 
  mutate(Variable = "Artistic Promotion Likelihood")) %>%
  rbind(., d2_aggregates %>%
  mutate(venue_dens = venue_count / area_km2,
         mean_ped_freq = ((3*ped_freq_3) + (2* ped_freq_2) + (1* ped_freq_1))/venue_count,
         pop_youth_pct_t2 = 100 * pop_youth_pct_t2) %>%
  as.data.frame() %>%
  select(mean_community_focus, transit_stations_per_km2, pop_per_km2_t2, pop_youth_pct_t2, mean_ped_freq, income_t2, rent_t2) %>%
  gather(-mean_community_focus, key = "variable", value = "value") %>%
  mutate(variable = case_when(variable == "transit_stations_per_km2" ~ "Transit Stations / km2",
                              variable == "pop_per_km2_t2" ~ "Population / km2",
                              variable == "mean_ped_freq" ~ "Mean Pedestrian Frequency (Max - 3)",
                              variable == "pop_youth_pct_t2" ~ "Percentage of Young Adults",
                              variable == "income_t2" ~ "Income",
                              variable == "rent_t2" ~ "Rent")) %>%
  filter(is.na(mean_community_focus) == FALSE) %>% 
  group_by(variable) %>% 
  summarize(pearsons_r = cor(as.numeric(value), mean_community_focus, method = "pearson")) %>%
  mutate(pearsons_r = case_when(pearsons_r >= 0.6 ~ "Positive (Strong)",
                                pearsons_r < 0.6 & pearsons_r >= 0.2 ~ "Positive",
                                pearsons_r < 0.2 & pearsons_r > -0.2 ~ "No Correlation",
                                pearsons_r <= -0.6 ~ "Negative (Strong)",
                                pearsons_r > -0.6 & pearsons_r <= -0.2 ~ "Negative")) %>%
  spread(key = variable, value = pearsons_r) %>% 
  mutate(Variable = "Community Focus Likelihood")) %>%
  kable() %>%
  kable_styling()

| Income | Mean Pedestrian Frequency (Max - 3) | Percentage of Young Adults | Population / km2 | Rent | Transit Stations / km2 | Variable |
|---|---|---|---|---|---|---|
| Positive | Positive | Positive | Positive (Strong) | Positive (Strong) | Positive | Venue Density |
| Negative | Positive | Negative | Positive | Negative | No Correlation | Experimentation Likelihood |
| Negative | Positive | Negative | No Correlation | Negative | Negative (Strong) | Creative Output Likelihood |
| Negative | Positive | Negative | No Correlation | Negative | Negative (Strong) | Artistic Promotion Likelihood |
| Negative | Positive | No Correlation | Positive | Negative | No Correlation | Community Focus Likelihood |

6.5 Multi-city Density Comparison

As of 8/26/2024, the Copenhagen analysis does not include Frederiksberg, so total population and venue counts reflect only the districts included in the analysis.

Copenhagen has a high per-capita and per-area density of venues among CFP cities. Both densities are higher than in any other city except Sydney, where most of the city was excluded from the study, so densities are artificially high. Copenhagen’s venues are also widely distributed: no single Copenhagen district ranks among the top 25 districts by venue density in the 8-city CFP sample.
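
As a worked check of the two density measures, the Copenhagen figures can be reproduced from the totals reported in the city table in this section (108 venues, 648,036 residents, 92.36 sq km). The arithmetic follows the `venues_per_10k` and `venue_sqkm` definitions in the code:

```latex
\text{venues per 10k} = \frac{108}{648036 / 10000} \approx 1.67
\qquad
\text{venues per km}^2 = \frac{108}{92.35782} \approx 1.17
```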

d1_aggregates %>%
  as.data.frame() %>%
  filter(city != "Sydney") %>%
  group_by(city) %>%
  summarize(total_pop = sum(pop_t2),
            total_venues = sum(venue_count, na.rm = TRUE),
            area_sqkm = sum(area_km2)) %>%
  rbind(.,
        d2_aggregates_all %>%
  as.data.frame() %>%
  filter(city %in% c("Sydney", "Rotterdam")) %>%
  group_by(city) %>%
  summarize(total_pop = sum(pop_t2),
            total_venues = sum(venue_count, na.rm = TRUE),
            area_sqkm = sum(area_km2))) %>%
   rbind(.,
        d2_aggregates %>%
  as.data.frame() %>%
  mutate(city = "Copenhagen") %>%
  group_by(city) %>%
  summarize(total_pop = sum(as.numeric(pop_t2)),
            total_venues = sum(venue_count, na.rm = TRUE),
            area_sqkm = sum(area_km2))) %>%
  mutate(venues_per_10k = total_venues / (total_pop/10000),
         venue_sqkm = total_venues / area_sqkm) %>% arrange(desc(total_pop)) %>%
  kable(
    col.names = c("City","Population","Number of Venues","Area (sq km)","Number of Venues per 10k People","Venues Density (/sq km)")
  ) %>% 
kable_styling()

| City | Population | Number of Venues | Area (sq km) | Number of Venues per 10k People | Venues Density (/sq km) |
|---|---|---|---|---|---|
| 東京都 | 9272740 | 581 | 656.71911 | 0.6265678 | 0.8847009 |
| New York | 8426743 | 493 | 1211.89980 | 0.5850422 | 0.4067993 |
| Berlin | 3517424 | 495 | 887.57482 | 1.4072799 | 0.5576995 |
| Montreal | 2004265 | 265 | 616.72228 | 1.3221805 | 0.4296910 |
| Stockholm | 975551 | 96 | 214.58924 | 0.9840593 | 0.4473663 |
| Rotterdam | 650505 | 67 | 290.13636 | 1.0299690 | 0.2309259 |
| Copenhagen | 648036 | 108 | 92.35782 | 1.6665741 | 1.1693650 |
| Sydney | 394178 | 240 | 70.44623 | 6.0886199 | 3.4068536 |

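
As a quick check on the table above, the two density columns can be reproduced by hand from the Copenhagen row. This is a minimal sketch that uses the printed (rounded) values rather than the `d2_aggregates` object, so the variable names are illustrative only.

```r
# Values copied from the Copenhagen row of the table above
total_pop    <- 648036
total_venues <- 108
area_sqkm    <- 92.35782

# Same formulas as the mutate() step in the preceding code
venues_per_10k <- total_venues / (total_pop / 10000)
venue_sqkm     <- total_venues / area_sqkm

# Print both densities, rounded for readability
round(c(venues_per_10k = venues_per_10k, venue_sqkm = venue_sqkm), 4)
```

Because the area is rounded in the printed table, the results should match the table to within rounding error rather than exactly.
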
d1_aggregates %>%
  as.data.frame() %>%
  filter(city != "Sydney" & 
           city != "New York") %>%
  select(city, d1_name, venue_count, area_km2) %>%
  rbind(.,
        d2_aggregates_all %>%
  as.data.frame() %>%
  filter(city %in% c("Sydney", "New York", "Rotterdam")) %>%
  select(city, venue_count, d2_name, area_km2) %>%
    rename(d1_name = d2_name)) %>%
  rbind(., d2_aggregates %>%
          as.data.frame() %>%
          select(venue_count, d2_name, area_km2) %>%
          rename(d1_name = d2_name) %>%
          mutate(city = "Copenhagen")) %>%
  mutate(venue_density = venue_count / area_km2) %>%
  arrange(-venue_density) %>%
  mutate(rank = row_number()) %>%
  kable(
    col.names = c("City","District Name","Venue Count", "Area (km2)", "Venues/km2", "Density Rank")
  ) %>% 
kable_styling() %>%
 scroll_box(height = "200px")
City District Name Venue Count Area (km2) Venues/km2 Density Rank
Sydney Darlinghurst 27 0.8573069 31.4939735 1
New York Midtown-Midtown South 54 2.8002533 19.2839694 2
New York Chinatown 31 1.6297224 19.0216447 3
New York East Village 18 1.0105534 17.8120223 4
Sydney Sydney - Haymarket - The Rocks 68 4.2948737 15.8328289 5
Sydney Potts Point - Woolloomooloo 18 1.4600147 12.3286433 6
New York North Side-South Side 32 2.6573944 12.0418707 7
Sydney Surry Hills 15 1.3166746 11.3923361 8
東京都 渋谷区 167 15.1194158 11.0454003 9
New York West Village 35 3.3916615 10.3194260 10
Montreal Le Plateau-Mont-Royal 78 8.1341650 9.5891834 11
Rotterdam Rotterdam Centrum 44 4.9064069 8.9678660 12
Berlin Friedrichshain-Kreuzberg 172 20.2479799 8.4946746 13
New York Clinton 20 2.5256447 7.9187702 14
New York Bushwick North 18 2.3070437 7.8021931 15
Sydney Newtown - Camperdown - Darlington 25 3.2819773 7.6173593 16
New York Central Harlem South 10 1.3394146 7.4659479 17
New York Gramercy 5 0.6940387 7.2042088 18
New York Bushwick South 26 3.7333007 6.9643467 19
Sydney Redfern - Chippendale 15 2.1633914 6.9335582 20
New York Hudson Yards-Chelsea-Flat Iron-Union Square 27 4.7366269 5.7002589 21
Montreal Ville-Marie 122 21.4545676 5.6864348 22
東京都 新宿区 87 18.2705806 4.7617534 23
Sydney Pyrmont - Ultimo 7 1.4915006 4.6932598 24
Stockholm Norrmalm 26 5.5474592 4.6868304 25
New York Brooklyn Heights-Cobble Hill 4 0.9533316 4.1958117 26
Copenhagen Nørrebro 17 4.0863623 4.1601794 27
Copenhagen Indre By 42 10.4183129 4.0313629 28
New York SoHo-TriBeCa-Civic Center-Little Italy 12 3.0928997 3.8798542 29
New York Upper East Side-Carnegie Hill 7 1.8619321 3.7595357 30
New York East Williamsburg 14 3.8341362 3.6514092 31
東京都 港区 79 24.4581428 3.2300081 32
Stockholm Södermalm 33 10.3189928 3.1979865 33
New York Greenpoint 11 3.4764871 3.1641136 34
New York Prospect Heights 3 0.9591657 3.1277182 35
Berlin Mitte 120 39.3129068 3.0524326 36
New York Park Slope-Gowanus 12 3.9437843 3.0427628 37
New York Lincoln Square 6 2.1100959 2.8434726 38
Sydney Glebe - Forest Lodge 6 2.3023174 2.6060699 39
Sydney Paddington - Moore Park 9 3.7256775 2.4156680 40
New York Stuyvesant Heights 7 2.9141911 2.4020387 41
New York Parkchester 2 0.8518291 2.3478888 42
Sydney Petersham - Stanmore 7 3.0036443 2.3305023 43
New York Erasmus 3 1.3482579 2.2250936 44
New York Manhattanville 3 1.4154807 2.1194214 45
東京都 豊島区 27 12.9723488 2.0813501 46
New York Marble Hill-Inwood 4 1.9680156 2.0325042 47
New York Lower East Side 8 4.0192714 1.9904105 48
New York Bedford 6 3.0276690 1.9817226 49
New York Fort Greene 3 1.5308090 1.9597481 50
Copenhagen Vesterbro/Kongens Enghave 16 8.2202277 1.9464181 51
New York Ridgewood 9 4.6736880 1.9256741 52
Sydney Erskineville - Alexandria 8 4.3214310 1.8512386 53
New York Turtle Bay-East Midtown 5 2.7382373 1.8259922 54
Sydney Marrickville 10 5.8143386 1.7198861 55
Sydney Lilyfield - Rozelle 6 3.6052202 1.6642534 56
New York Clinton Hill 3 1.9022405 1.5770877 57
東京都 中央区 17 10.9365767 1.5544169 58
Sydney Balmain 4 2.5818140 1.5492983 59
New York Central Harlem North-Polo Grounds 4 2.6257977 1.5233466 60
東京都 杉並区 46 33.9314315 1.3556752 61
Sydney Leichhardt - Annandale 6 4.4737785 1.3411482 62
東京都 千代田区 15 11.3650272 1.3198385 63
Berlin Neukölln 59 44.7557902 1.3182652 64
東京都 台東区 13 10.0868731 1.2888038 65
Montreal Le Sud-Ouest 23 18.1055520 1.2703286 66
New York Mott Haven-Port Morris 6 4.9820337 1.2043275 67
Copenhagen Bispebjerg 8 6.7957940 1.1771987 68
New York DUMBO-Vinegar Hill-Downtown Brooklyn-Boerum Hill 3 2.6205028 1.1448185 69
東京都 世田谷区 64 58.1264075 1.1010486 70
New York Crown Heights North 5 4.7919635 1.0434136 71
New York Spuyten Duyvil-Kingsbridge 3 2.9183674 1.0279720 72
New York Morningside Heights 3 2.9194324 1.0275970 73
Copenhagen Østerbro 9 9.7229774 0.9256424 74
New York Murray Hill-Kips Bay 2 2.1786399 0.9180039 75
Stockholm Östermalm 21 23.9306360 0.8775362 76
Rotterdam Delfshaven 5 5.9304907 0.8431006 77
New York Upper West Side 4 4.8152846 0.8306882 78
Montreal Rosemont-La Petite-Patrie 13 15.8523232 0.8200691 79
Sydney Ashfield 3 3.6704585 0.8173366 80
New York Co-op City 3 3.7074211 0.8091878 81
Montreal Outremont 3 3.8051755 0.7883999 82
New York Stuyvesant Town-Cooper Village 1 1.3061074 0.7656338 83
Sydney Dulwich Hill - Lewisham 2 2.7024868 0.7400591 84
東京都 中野区 11 15.5821578 0.7059356 85
Rotterdam Feijenoord 6 8.5230683 0.7039718 86
New York Prospect Lefferts Gardens-Wingate 2 2.9357728 0.6812516 87
New York Crown Heights South 1 1.4831464 0.6742423 88
New York Highbridge 1 1.5081489 0.6630645 89
New York Crotona Park East 1 1.5298447 0.6536611 90
New York East Flatbush-Farragut 2 3.1870228 0.6275449 91
東京都 文京区 7 11.3704312 0.6156319 92
東京都 目黒区 9 14.8401749 0.6064619 93
New York Hunts Point 4 6.5990226 0.6061504 94
New York East Concourse-Concourse Village 1 1.6918630 0.5910644 95
New York West Concourse 1 1.7683954 0.5654844 96
Stockholm Enskede-Årsta-Vantör 12 21.4558373 0.5592884 97
Rotterdam Noord 3 5.3672085 0.5589498 98
Sydney Waterloo - Beaconsfield 2 3.5822053 0.5583153 99
Copenhagen Amager Øst 5 9.2973689 0.5377866 100
New York Sunset Park West 5 9.3969302 0.5320887 101
Berlin Pankow 54 102.6967098 0.5258202 102
Sydney Sydenham - Tempe - St Peters 2 3.8551239 0.5187901 103
Berlin Tempelhof-Schöneberg 27 52.9136174 0.5102656 104
New York Washington Heights North 2 3.9690593 0.5038977 105
New York Elmhurst-Maspeth 1 2.0280964 0.4930732 106
Montreal Villeray-Saint-Michel-Parc-Extension 8 16.4418483 0.4865633 107
Copenhagen Vanløse 3 6.6652325 0.4500968 108
New York Queensbridge-Ravenswood-Long Island City 1 2.2230057 0.4498414 109
New York Westchester-Unionport 1 2.2507352 0.4442993 110
Rotterdam Charlois 5 11.8990066 0.4202031 111
New York East Harlem South 1 2.3844742 0.4193797 112
New York Van Cortlandt Village 1 2.3941893 0.4176779 113
New York Carroll Gardens-Columbia Street-Red Hook 4 9.7929812 0.4084558 114
New York Woodside 1 2.6240659 0.3810880 115
東京都 墨田区 5 13.7417454 0.3638548 116
New York Allerton-Pelham Gardens 1 2.9382851 0.3403346 117
New York Brownsville 1 3.0332438 0.3296801 118
Montreal Mercier-Hochelaga-Maisonneuve 9 27.3492733 0.3290764 119
Copenhagen Valby 3 9.1885094 0.3264947 120
Berlin Charlottenburg-Wilmersdorf 21 64.4375733 0.3258968 121
New York Seagate-Coney Island 2 6.2456601 0.3202224 122
New York Maspeth 1 3.3213158 0.3010855 123
New York Washington Heights South 1 3.3495690 0.2985459 124
東京都 品川区 8 27.0119411 0.2961653 125
New York Kew Gardens Hills 1 3.5287973 0.2833827 126
Montreal Côte-des-Neiges-Notre-Dame-de-Grâce 6 21.4378047 0.2798794 127
New York Woodlawn-Wakefield 1 3.6446790 0.2743726 128
New York Astoria 1 3.6495687 0.2740050 129
New York Bensonhurst East 1 3.6718804 0.2723400 130
New York South Jamaica 1 3.7140168 0.2692503 131
New York Battery Park City-Lower Manhattan 1 3.7381125 0.2675147 132
New York Flatbush 1 4.1978295 0.2382183 133
New York New Brighton-Silver Lake 1 4.3793368 0.2283451 134
New York Jamaica 1 4.3848573 0.2280576 135
New York West New Brighton-New Brighton-St. George 2 8.8469541 0.2260665 136
New York Schuylerville-Throgs Neck-Edgewater Park 4 18.8174972 0.2125681 137
New York Richmond Hill 1 4.7353359 0.2111783 138
Copenhagen Amager Vest 4 19.2663407 0.2076160 139
New York Hunters Point-Sunnyside-West Maspeth 2 9.9182923 0.2016476 140
Berlin Lichtenberg 9 51.9383369 0.1732824 141
Rotterdam IJsselmonde 2 13.1009552 0.1526606 142
New York Steinway 1 6.8263689 0.1464908 143
東京都 北区 3 20.4957190 0.1463720 144
New York North Riverdale-Fieldston-Riverdale 1 6.9567470 0.1437453 145
New York Sheepshead Bay-Gerritsen Beach-Manhattan Beach 1 7.3791060 0.1355178 146
東京都 江東区 7 55.2347111 0.1267319 147
東京都 練馬区 6 48.1419087 0.1246315 148
Stockholm Hägersten-Älvsjö 3 24.4447288 0.1227258 149
Copenhagen Brønshøj-Husum 1 8.6966948 0.1149862 150
New York College Point 1 8.7223805 0.1146476 151
Berlin Treptow-Köpenick 18 167.0701970 0.1077391 152
Stockholm Skärholmen 1 10.1425427 0.0985946 153
Berlin Marzahn-Hellersdorf 6 61.5770031 0.0974390 154
東京都 江戸川区 4 49.3394958 0.0810710 155
Rotterdam Kralingen-Crooswijk 1 12.7215597 0.0786067 156
Montreal Saint-Léonard 1 13.5213926 0.0739569 157
東京都 大田区 5 75.1911276 0.0664972 158
Berlin Spandau 6 91.5120393 0.0655651 159
New York New Springville-Bloomfield-Travis 2 31.9838943 0.0625315 160
Rotterdam Prins Alexander 1 18.6627216 0.0535828 161
Montreal Lachine 1 23.0784749 0.0433304 162
Montreal Ahuntsic-Cartierville 1 25.5160844 0.0391910 163
Berlin Reinickendorf 3 88.9636853 0.0337216 164
New York park-cemetery-etc-Bronx 1 29.9930264 0.0333411 165
東京都 足立区 1 53.2578235 0.0187766 166
東京都 荒川区 NA 10.2287383 NA 167
東京都 板橋区 NA 32.1918125 NA 168
東京都 葛飾区 NA 34.8245166 NA 169
Berlin Steglitz-Zehlendorf NA 102.1489826 NA 170
Stockholm Kungsholmen NA 6.9429741 NA 171
Stockholm Farsta NA 17.1135265 NA 172
Stockholm Bromma NA 27.6206531 NA 173
Stockholm Hässelby-Vällingby NA 24.8591388 NA 174
Stockholm Skarpnäck NA 17.4620933 NA 175
Stockholm Spånga-Tensta NA 12.8603071 NA 176
Stockholm Rinkeby-Kista NA 11.8903550 NA 177
Montreal LaSalle NA 25.1437195 NA 178
Montreal Mont-Royal NA 7.4295857 NA 179
Montreal Hampstead NA 1.7642744 NA 180
Montreal Rivière-des-Prairies-Pointe-aux-Trembles NA 49.9380360 NA 181
Montreal Dorval NA 28.0959506 NA 182
Montreal Montréal-Nord NA 12.4032778 NA 183
Montreal L’Île-Bizard-Sainte-Geneviève NA 36.4534924 NA 184
Montreal Kirkland NA 9.6667690 NA 185
Montreal Dollard-des-Ormeaux NA 15.0327412 NA 186
Montreal Senneville NA 18.5697951 NA 187
Montreal Côte-Saint-Luc NA 6.7956587 NA 188
Montreal Montréal-Ouest NA 1.4164230 NA 189
Montreal Pointe-Claire NA 34.3728005 NA 190
Montreal L’Île-Dorval NA 0.1801239 NA 191
Montreal Saint-Laurent NA 42.9853936 NA 192
Montreal Beaconsfield NA 24.8691706 NA 193
Montreal Westmount NA 4.0077149 NA 194
Montreal Montréal-Est NA 13.9436851 NA 195
Montreal Anjou NA 13.8481238 NA 196
Montreal Pierrefonds-Roxboro NA 33.6924740 NA 197
Montreal Sainte-Anne-de-Bellevue NA 11.1265670 NA 198
Montreal Verdun NA 22.2811389 NA 199
Montreal Baie-d’Urfé NA 8.0087000 NA 200
New York Brighton Beach NA 1.8937054 NA 201
New York West Brighton NA 0.8134721 NA 202
New York Homecrest NA 2.7826312 NA 203
New York Gravesend NA 3.3490590 NA 204
New York Bath Beach NA 2.4803128 NA 205
New York Bensonhurst West NA 4.3962015 NA 206
New York Dyker Heights NA 2.7740491 NA 207
New York Bay Ridge NA 12.9486553 NA 208
New York Sunset Park East NA 2.5135895 NA 209
New York Windsor Terrace NA 1.3030339 NA 210
New York Kensington-Ocean Parkway NA 1.4746683 NA 211
New York Midwood NA 3.3226713 NA 212
New York Madison NA 2.5403645 NA 213
New York Georgetown-Marine Park-Bergen Beach-Mill Basin NA 8.1416548 NA 214
New York Ocean Parkway South NA 1.6540101 NA 215
New York Canarsie NA 8.4307076 NA 216
New York Flatlands NA 5.0497382 NA 217
New York Williamsburg NA 1.0754648 NA 218
New York Ocean Hill NA 1.8627183 NA 219
New York East New York NA 11.7778366 NA 220
New York Cypress Hills-City Line NA 2.5532849 NA 221
New York East New York (Pennsylvania Ave) NA 1.8029009 NA 222
New York Borough Park NA 5.0074899 NA 223
New York Starrett City NA 1.2066294 NA 224
New York Rugby-Remsen Village NA 3.0292675 NA 225
New York park-cemetery-etc-Brooklyn NA 65.2042848 NA 226
New York Claremont-Bathgate NA 1.5248039 NA 227
New York Eastchester-Edenwald-Baychester NA 3.7251272 NA 228
New York Bedford Park-Fordham North NA 1.3895957 NA 229
New York Belmont NA 1.2664631 NA 230
New York Bronxdale NA 1.4107283 NA 231
New York West Farms-Bronx River NA 1.3932731 NA 232
New York Soundview-Castle Hill-Clason Point-Harding Park NA 7.5660253 NA 233
New York Pelham Bay-Country Club-City Island NA 12.1768064 NA 234
New York East Tremont NA 1.7998605 NA 235
New York Kingsbridge Heights NA 1.2430887 NA 236
New York Longwood NA 0.9965086 NA 237
New York Melrose South-Mott Haven North NA 1.6038093 NA 238
New York Morrisania-Melrose NA 1.5660179 NA 239
New York University Heights-Morris Heights NA 1.9881846 NA 240
New York Van Nest-Morris Park-Westchester Square NA 3.3531194 NA 241
New York Fordham South NA 0.5845523 NA 242
New York Mount Hope NA 1.3641714 NA 243
New York Norwood NA 1.4587939 NA 244
New York Williamsbridge-Olinville NA 3.3677750 NA 245
New York Pelham Parkway NA 2.1465052 NA 246
New York Soundview-Bruckner NA 1.5012932 NA 247
New York Rikers Island NA 2.7017658 NA 248
New York Hamilton Heights NA 2.3087123 NA 249
New York Lenox Hill-Roosevelt Island NA 3.1099915 NA 250
New York Yorkville NA 1.8890459 NA 251
New York East Harlem North NA 2.9629487 NA 252
New York park-cemetery-etc-Manhattan NA 14.0007113 NA 253
New York Springfield Gardens North NA 2.6384820 NA 254
New York Springfield Gardens South-Brookville NA 4.0421927 NA 255
New York Rosedale NA 5.7313477 NA 256
New York Jamaica Estates-Holliswood NA 3.9788122 NA 257
New York Hollis NA 2.1224191 NA 258
New York St. Albans NA 7.1893115 NA 259
New York Breezy Point-Belle Harbor-Rockaway Park-Broad Channel NA 17.4701853 NA 260
New York Hammels-Arverne-Edgemere NA 10.8549352 NA 261
New York Far Rockaway-Bayswater NA 6.5394274 NA 262
New York Forest Hills NA 5.3685110 NA 263
New York Rego Park NA 1.8420460 NA 264
New York Glendale NA 2.7762328 NA 265
New York Middle Village NA 5.3728392 NA 266
New York Flushing NA 3.5178239 NA 267
New York Corona NA 1.8703544 NA 268
New York North Corona NA 1.6702686 NA 269
New York East Elmhurst NA 1.7926989 NA 270
New York Jackson Heights NA 4.4515635 NA 271
New York Elmhurst NA 3.0325558 NA 272
New York Cambria Heights NA 3.1203988 NA 273
New York Queens Village NA 6.5122115 NA 274
New York Briarwood-Jamaica Hills NA 2.7220111 NA 275
New York Pomonok-Flushing Heights-Hillcrest NA 3.6077920 NA 276
New York Fresh Meadows-Utopia NA 2.5721683 NA 277
New York Oakland Gardens NA 4.7162745 NA 278
New York Bellerose NA 5.0881230 NA 279
New York Glen Oaks-Floral Park-New Hyde Park NA 4.2337202 NA 280
New York Douglas Manor-Douglaston-Little Neck NA 7.3529631 NA 281
New York Bayside-Bayside Hills NA 8.1381071 NA 282
New York Ft. Totten-Bay Terrace-Clearview NA 8.8947762 NA 283
New York Auburndale NA 3.1742823 NA 284
New York Whitestone NA 9.4011905 NA 285
New York Murray Hill NA 4.8704594 NA 286
New York East Flushing NA 2.7341558 NA 287
New York Woodhaven NA 3.4480840 NA 288
New York South Ozone Park NA 7.5912561 NA 289
New York Ozone Park NA 2.3294517 NA 290
New York Lindenwood-Howard Beach NA 7.1703050 NA 291
New York Kew Gardens NA 1.8986363 NA 292
New York Queensboro Hill NA 2.4499584 NA 293
New York Laurelton NA 3.6747951 NA 294
New York Old Astoria NA 1.4729512 NA 295
New York Baisley Park NA 4.0884902 NA 296
New York Airport NA 24.3885253 NA 297
New York park-cemetery-etc-Queens NA 172.7338831 NA 298
New York Annadale-Huguenot-Prince’s Bay-Eltingville NA 19.6291150 NA 299
New York Westerleigh NA 5.8636530 NA 300
New York Grymes Hill-Clifton-Fox Hills NA 3.5181857 NA 301
New York Charleston-Richmond Valley-Tottenville NA 18.6336298 NA 302
New York Mariner’s Harbor-Arlington-Port Ivory-Graniteville NA 9.8380216 NA 303
New York Grasmere-Arrochar-Ft. Wadsworth NA 5.6777242 NA 304
New York Todt Hill-Emerson Hill-Heartland Village-Lighthouse Hill NA 17.1375623 NA 305
New York Oakwood-Oakwood Beach NA 6.1663325 NA 306
New York Port Richmond NA 3.9243841 NA 307
New York Rossville-Woodrow NA 6.0258564 NA 308
New York Old Town-Dongan Hills-South Beach NA 7.6134784 NA 309
New York Stapleton-Rosebank NA 8.3703705 NA 310
New York New Dorp-Midland Beach NA 6.2090210 NA 311
New York Arden Heights NA 4.6787025 NA 312
New York Great Kills NA 11.0024987 NA 313
New York park-cemetery-etc-Staten Island NA 85.1199744 NA 314
Sydney Burwood - Croydon NA 4.4941239 NA 315
Sydney Croydon Park - Enfield NA 3.9723088 NA 316
Sydney Haberfield - Summer Hill NA 3.4755633 NA 317
Rotterdam Overschie NA 17.3443818 NA 318
Rotterdam Hillegersberg-Schiebroek NA 13.2748738 NA 319
Rotterdam Pernis NA 1.5937589 NA 320
Rotterdam Hoogvliet NA 10.3442731 NA 321
Rotterdam Hoek van Holland NA 18.5904905 NA 322
Rotterdam Waalhaven-Eemhaven NA 15.0211502 NA 323
Rotterdam Vondelingenplaat NA 9.4912269 NA 324
Rotterdam Botlek-Europoort-Maasvlakte NA 115.5976172 NA 325
Rotterdam Rotterdam-Noord-West NA 1.1716582 NA 326
Rotterdam Rivium NA 0.1146757 NA 327
Rotterdam Rozenburg NA 6.4808323 NA 328

6.5.1 Point Pattern Analysis of Clustering and Centrality

Source: CFP

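Copenhagen’s venues are more tightly clustered with one another than those of any other CFP city: the mean distance from a venue to its two nearest neighbors is about 82 m (median about 47 m), the lowest of the eight cities. This is very notable, and perhaps indicative of the human-scale transit accessibility the city is known for.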

The table below reports the average nearest-neighbor (nn) distance between each venue and its two nearest neighbors, in meters.

venues_pattern <- main_venue_data %>% 
  filter(city == "Rotterdam", is.na(x) == FALSE) %>% 
  st_as_sf(coords = c("x", "y"), crs = 4326) %>%
  st_transform(epsg_rotterdam) %>%
   mutate(venue_nn2 = nn_function(st_coordinates(.), 
                              st_coordinates(.), k = 3)) %>%
  st_transform(4326) %>%
  rbind(.,  main_venue_data %>% 
  filter(city == "Sydney", is.na(x) == FALSE) %>% 
  st_as_sf(coords = c("x", "y"), crs = 4326) %>%
  st_transform(epsg_sydney) %>%
   mutate(venue_nn2 = nn_function(st_coordinates(.), 
                              st_coordinates(.), k = 3)) %>%
    st_transform(4326)) %>%
  rbind(.,  main_venue_data %>% 
  filter(city == "Stockholm", is.na(x) == FALSE) %>% 
  st_as_sf(coords = c("x", "y"), crs = 4326) %>%
  st_transform(epsg_stockholm) %>%
   mutate(venue_nn2 = nn_function(st_coordinates(.), 
                              st_coordinates(.), k = 3)) %>%
    st_transform(4326)) %>%
rbind(.,  main_venue_data %>% 
  filter(city == "Tokyo", is.na(x) == FALSE) %>% 
  st_as_sf(coords = c("x", "y"), crs = 4326) %>%
  st_transform(epsg_tokyo) %>%
   mutate(venue_nn2 = nn_function(st_coordinates(.), 
                              st_coordinates(.), k = 3)) %>%
    st_transform(4326)) %>%
  rbind(.,  main_venue_data %>% 
  filter(city == "Berlin", is.na(x) == FALSE) %>% 
  st_as_sf(coords = c("x", "y"), crs = 4326) %>%
  st_transform(epsg_berlin) %>%
   mutate(venue_nn2 = nn_function(st_coordinates(.), 
                              st_coordinates(.), k = 3)) %>%
    st_transform(4326)) %>%
  rbind(.,  main_venue_data %>% 
  filter(city == "Montreal", is.na(x) == FALSE) %>% 
  st_as_sf(coords = c("x", "y"), crs = 4326) %>%
  st_transform(epsg_montreal) %>%
   mutate(venue_nn2 = nn_function(st_coordinates(.), 
                              st_coordinates(.), k = 3)) %>%
    st_transform(4326)) %>%
  rbind(.,  main_venue_data %>% 
  filter(city == "New York", is.na(x) == FALSE) %>% 
  st_as_sf(coords = c("x", "y"), crs = 4326) %>%
  st_transform(epsg_new_york) %>%
   mutate(venue_nn2 = nn_function(st_coordinates(.), 
                              st_coordinates(.), k = 3) / 3.28) %>%
    st_transform(4326)) %>%
  rbind(.,  main_venue_data %>% 
  filter(city == "Copenhagen", is.na(x) == FALSE) %>% 
  st_as_sf(coords = c("x", "y"), crs = 4326) %>%
  st_transform(epsg_copenhagen) %>%
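   # NOTE: dividing by 3.28 converts feet to meters, as in the New York block above.
   # The centroid-distance code later in this section applies no such conversion for
   # Copenhagen, so confirm that epsg_copenhagen is a feet-based CRS before relying on this value.
   mutate(venue_nn2 = nn_function(st_coordinates(.), 
                              st_coordinates(.), k = 3) / 3.28) %>%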
    st_transform(4326))

venues_pattern %>%
  as.data.frame() %>%
  group_by(city) %>%
  summarize(mean_knn2_m = mean(venue_nn2),
            median_knn2_m = median(venue_nn2)) %>%
  kable() %>%
  kable_styling()
city mean_knn2_m median_knn2_m
Berlin 234.21637 96.06403
Copenhagen 82.03627 47.32980
Montreal 179.53679 98.11968
New York 255.98960 119.67304
Rotterdam 263.13005 131.46448
Stockholm 279.17632 126.71238
Sydney 137.42448 92.92099
Tokyo 183.46044 70.81106
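
For readers unfamiliar with this metric, the sketch below shows one way to compute a mean distance to the two nearest neighbors in base R. It is an illustration only: `nn_function()` is not reproduced in this section, `mean_knn_dist()` is a hypothetical helper, and the toy coordinates are not CFP data. The sketch assumes projected coordinates in meters and excludes each point's zero distance to itself. If `nn_function()` instead counts the point itself among the k = 3 neighbors, its averages would be correspondingly smaller.

```r
# Hypothetical helper: mean distance from each point to its k nearest
# *other* points, given a two-column matrix of projected coordinates (meters).
mean_knn_dist <- function(coords, k = 2) {
  d <- as.matrix(dist(coords))   # full pairwise Euclidean distance matrix
  diag(d) <- Inf                 # drop each point's zero distance to itself
  apply(d, 1, function(row) mean(sort(row)[seq_len(k)]))
}

# Toy coordinates in meters (not CFP data): three venues close together
# along a street and one outlier a kilometer away
toy <- cbind(x = c(0, 100, 200, 1000), y = c(0, 0, 0, 0))

knn2 <- mean_knn_dist(toy, k = 2)

# City-level summary in the same form as the table above
c(mean_knn2_m = mean(knn2), median_knn2_m = median(knn2))
```

A single distant venue pulls the mean up far more than the median, which is one reason the table reports both.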

On average, venues in Copenhagen are relatively dispersed across the city as a whole, but they are very clustered locally. This presents excellent opportunities for managed districts and neighborhood-centered scenes. Global clustering is measured here as the average distance from each venue to the centroid of the city’s venues (the average X,Y location).

centers <- main_venue_data %>%
  filter(is.na(x) == FALSE) %>%
  group_by(city) %>%
  summarize(mean_x = mean(x),
         mean_y = mean(y))


venues_centroid_dist <- main_venue_data %>% 
  filter(city == "Rotterdam", is.na(x) == FALSE) %>% 
  st_as_sf(coords = c("x", "y"), crs = 4326) %>%
  st_transform(epsg_rotterdam) %>%
   mutate(centroid_dist = st_distance(., centers %>%
                                    filter(city == "Rotterdam") %>%
                                    st_as_sf(coords = c("mean_x", "mean_y"), crs = 4326) %>%
  st_transform(epsg_rotterdam))) %>%
  st_transform(4326) %>%
  rbind(.,  main_venue_data %>% 
  filter(city == "Sydney", is.na(x) == FALSE) %>% 
  st_as_sf(coords = c("x", "y"), crs = 4326) %>%
  st_transform(epsg_sydney) %>%
  mutate(centroid_dist = st_distance(., centers %>%
                                    filter(city == "Sydney") %>%
                                    st_as_sf(coords = c("mean_x", "mean_y"), crs = 4326) %>%
  st_transform(epsg_sydney))) %>%
    st_transform(4326)) %>%
  rbind(.,  main_venue_data %>% 
  filter(city == "Stockholm", is.na(x) == FALSE) %>% 
  st_as_sf(coords = c("x", "y"), crs = 4326) %>%
  st_transform(epsg_stockholm) %>%
   mutate(centroid_dist = st_distance(., centers %>%
                                    filter(city == "Stockholm") %>%
                                    st_as_sf(coords = c("mean_x", "mean_y"), crs = 4326) %>%
  st_transform(epsg_stockholm))) %>%
    st_transform(4326)) %>%
rbind(.,  main_venue_data %>% 
  filter(city == "Tokyo", is.na(x) == FALSE) %>% 
  st_as_sf(coords = c("x", "y"), crs = 4326) %>%
  st_transform(epsg_tokyo) %>%
    mutate(centroid_dist = st_distance(., centers %>%
                                    filter(city == "Tokyo") %>%
                                    st_as_sf(coords = c("mean_x", "mean_y"), crs = 4326) %>%
  st_transform(epsg_tokyo))) %>%
    st_transform(4326)) %>%
  rbind(.,  main_venue_data %>% 
  filter(city == "Berlin", is.na(x) == FALSE) %>% 
  st_as_sf(coords = c("x", "y"), crs = 4326) %>%
  st_transform(epsg_berlin) %>%
   mutate(centroid_dist = st_distance(., centers %>%
                                    filter(city == "Berlin") %>%
                                    st_as_sf(coords = c("mean_x", "mean_y"), crs = 4326) %>%
  st_transform(epsg_berlin))) %>%
    st_transform(4326)) %>%
  rbind(.,  main_venue_data %>% 
  filter(city == "Montreal", is.na(x) == FALSE) %>% 
  st_as_sf(coords = c("x", "y"), crs = 4326) %>%
  st_transform(epsg_montreal) %>%
   mutate(centroid_dist = st_distance(., centers %>%
                                    filter(city == "Montreal") %>%
                                    st_as_sf(coords = c("mean_x", "mean_y"), crs = 4326) %>%
  st_transform(epsg_montreal))) %>%
    st_transform(4326)) %>%
  rbind(.,  main_venue_data %>% 
  filter(city == "New York", is.na(x) == FALSE) %>% 
  st_as_sf(coords = c("x", "y"), crs = 4326) %>%
  st_transform(epsg_new_york) %>%
   mutate(centroid_dist = st_distance(., centers %>%
                                    filter(city == "New York") %>%
                                    st_as_sf(coords = c("mean_x", "mean_y"), crs = 4326) %>%
  st_transform(epsg_new_york)) / 3.28) %>%
    st_transform(4326)) %>%
  rbind(.,  main_venue_data %>% 
  filter(city == "Copenhagen", is.na(x) == FALSE) %>% 
  st_as_sf(coords = c("x", "y"), crs = 4326) %>%
  st_transform(epsg_copenhagen) %>%
   mutate(centroid_dist = st_distance(., centers %>%
                                    filter(city == "Copenhagen") %>%
                                    st_as_sf(coords = c("mean_x", "mean_y"), crs = 4326) %>%
  st_transform(epsg_copenhagen))) %>%
    st_transform(4326))

venues_centroid_dist %>%
  as.data.frame() %>%
  group_by(city) %>%
  summarize(mean_centroid_dist = mean(centroid_dist),
            median_centroid_dist = median(centroid_dist)) %>%
  kable() %>%
  kable_styling()
city mean_centroid_dist median_centroid_dist
Berlin 3484.748 2801.521
Copenhagen 2332.798 2215.211
Montreal 2172.774 1700.682
New York 5299.167 3612.162
Rotterdam 1465.930 1145.001
Stockholm 2066.532 1827.351
Sydney 2157.818 1926.532
Tokyo 4072.481 3099.027
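
The centroid-distance measure itself is simple enough to sketch in base R. The example below is a simplified illustration, not the code used for the table: `centroid_dist()` is a hypothetical helper, the toy coordinates are not CFP data, and it averages projected coordinates (assumed to be in meters) directly, whereas the code above averages longitude and latitude first and then projects the result.

```r
# Hypothetical helper: Euclidean distance from each point to the mean
# X,Y location of all points (projected coordinates, meters assumed).
centroid_dist <- function(coords) {
  center <- colMeans(coords)                  # average X,Y location
  sqrt(rowSums(sweep(coords, 2, center)^2))   # distance of each point to it
}

# Toy coordinates (not CFP data): four venues at the corners of a 4 x 3 rectangle
toy <- cbind(x = c(0, 0, 4, 4), y = c(0, 3, 0, 3))

d <- centroid_dist(toy)

# Summary in the same form as the table above
c(mean_centroid_dist = mean(d), median_centroid_dist = median(d))
```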

7 Borough Thematic Mapping

Sources: CFP, City of Copenhagen, Resk (2020), Finans Danmark, OpenData DK

7.1 Urban Characteristics

7.1.1 Urban Characteristics - Quantile Ranks

This “heatmap” chart shows the relatively high or low values of urban variables by district.

These charts are useful to reference in the context of the district profiles in Section 9.

Nørrebro stands out here as atypical: it has very high programming scores, high venue density, and good transit access, yet relatively low incomes and property values. We do not usually see this combination.
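
As a minimal sketch of how a quartile rank like those shown in the heatmap can be assigned, the base R example below mirrors the `case_when()` thresholds used for this chart. `quartile_rank()` is a hypothetical helper and the toy values are not CFP data.

```r
# Hypothetical helper: rank each value 1 (bottom quartile) to 4 (top quartile)
# using the 25th, 50th, and 75th percentiles of the same vector.
quartile_rank <- function(value) {
  q <- unname(quantile(value, c(0.25, 0.5, 0.75), na.rm = TRUE))
  ifelse(value >= q[3], 4,
         ifelse(value <= q[1], 1,
                ifelse(value < q[2], 2, 3)))
}

# Toy values for eight districts (not CFP data)
toy_values <- c(5, 1, 8, 3, 2, 7, 4, 6)

ranks <- quartile_rank(toy_values)
ranks
```

Because the ranks are relative to the districts in the chart, a rank of 4 means "high for Copenhagen", not high in absolute terms.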

# Get quartile values for each of the d2 aggregates for various characteristics

d2_aggregates %>%
    mutate(venue_dens = venue_count / area_km2,
           #mean_ped_freq = ((3*ped_freq_3) + (2* ped_freq_2) + (1* ped_freq_1))/venue_count,
           pop_youth_pct_t2 = 100 * pop_youth_pct_t2) %>%
    as.data.frame() %>%
    select(d2_name, transit_stations_per_km2, pop_per_km2_t2, pop_youth_pct_t2, 
           #mean_ped_freq, 
           income_t2, rent_t2, venue_dens) %>%
    gather(-d2_name, key = "variable", value = "value") %>% 
  mutate(value = as.numeric(value)) %>%
  group_by(variable) %>% 
  summarize(quantile = scales::percent(c(0.25, 0.5, 0.75)), 
            quartile = quantile(value, c(0.25, 0.5, 0.75), na.rm = TRUE)) %>% 
  spread(quantile, quartile) %>% # join it back to the original data analysis
  left_join(d2_aggregates %>%
    mutate(venue_dens = venue_count / area_km2,
          # mean_ped_freq = ((3*ped_freq_3) + (2* ped_freq_2) + (1* ped_freq_1))/venue_count,
           pop_youth_pct_t2 = 100 * pop_youth_pct_t2) %>%
    as.data.frame() %>%
    select(d2_name, transit_stations_per_km2, pop_per_km2_t2, pop_youth_pct_t2, 
           income_t2, rent_t2, venue_dens) %>%
    gather(-d2_name, key = "variable", value = "value"), .) %>%
  mutate(value = as.numeric(value)) %>%
  mutate(quantile = case_when(value >= `75%` ~ 4,
                              value <= `25%` ~ 1,
                              value > `25%` & value < `50%` ~ 2,
                              value >= `50%` & value < `75%` ~ 3))%>%
  mutate(variable = case_when(variable == "transit_stations_per_km2" ~ "Transit Density",
                              variable == "pop_per_km2_t2" ~ "Pop Density",
                              variable == "pop_youth_pct_t2" ~ "Pct Young Adults",
                              variable == "income_t2" ~ "Income",
                              variable == "rent_t2" ~ "Rent",
                              variable == "venue_dens" ~ "Venue Density")) %>%
  ggplot()+
  geom_tile(aes(y = d2_name, x = variable, fill = quantile),
            alpha = 0.6)+ #flip coord, order desc
  scale_fill_gradient(low = "white", high = CityPalette[5],
                     limits=c(1,4), breaks=c(1,4),
                     labels=c("Minimum","Maximum"),
                     name = "")+
  coord_flip()+
  labs(
    title = "District Planning & Population Characteristics",
     subtitle = "Districts with no venues have their density represented as grey",
     caption = "Data: CFP, City of Copenhagen, Resk (2020), Finans Danmark, OpenData DK")+
  #ylab("Quartile")+
  #xlab("")+
    plotTheme

7.1.2 Program and Space Characteristics - Quantile Ranks

# Get quartile values for each of the d2 aggregates for various characteristics

d2_aggregates %>%
    filter(venue_count > 0) %>%
    mutate(venue_dens = venue_count / area_km2,
           med_size = ((4* size_4) + (3*size_3) + (2* size_2) + (1* size_1))/venue_count) %>%
    as.data.frame() %>%
    select(d2_name, mean_experimentation, mean_creative_output, mean_community_focus, venue_dens, med_size, venue_count, mean_promotion) %>%
    gather(-d2_name, key = "variable", value = "value") %>% 
  mutate(value = as.numeric(value)) %>%
  group_by(variable) %>% 
  summarize(quantile = scales::percent(c(0.25, 0.5, 0.75)), 
            quartile = quantile(value, c(0.25, 0.5, 0.75), na.rm = TRUE)) %>% 
  spread(quantile, quartile) %>% # join it back to the original data analysis
  left_join(d2_aggregates %>%
    filter(venue_count > 0) %>%
    mutate(venue_dens = venue_count / area_km2,
           med_size = ((4* size_4) + (3*size_3) + (2* size_2) + (1* size_1))/venue_count) %>%
    as.data.frame() %>%
    select(d2_name, mean_experimentation, mean_creative_output, mean_community_focus, venue_dens, med_size, venue_count, mean_promotion) %>%
    gather(-d2_name, key = "variable", value = "value"), .) %>%
  mutate(value = as.numeric(value)) %>%
  mutate(quantile = case_when(value >= `75%` ~ 4,
                              value < `25%` ~ 1,
                              value >= `25%` & value < `50%` ~ 2,
                              value >= `50%` & value < `75%` ~ 3)) %>%
  mutate(variable = case_when(variable == "mean_experimentation" ~ "Experimental Content",
                              variable == "venue_dens" ~ "Venue Density",
                              variable == "venue_count" ~ "Total Venues",
                              variable == "mean_promotion" ~ "Promotion of\n Artistic Content",
                              variable == "med_size" ~ "Mean Venue Size", # med_size is a weighted mean of the size categories
                              variable == "mean_creative_output" ~ "Creative Content",
                              variable == "mean_community_focus" ~ "Community Focus")) %>%
  ggplot()+
  geom_tile(aes(y = d2_name, x = variable, fill = quantile),
            alpha = 0.6)+ #flip coord, order desc
  scale_fill_gradient(low = "white", high = CityPalette[5],
                     limits=c(1,4), breaks=c(1,4),
                     labels=c("Minimum","Maximum"),
                     name = "")+
  coord_flip()+
  labs(
    title = "District Venue Characteristics",
     subtitle = "",
     caption = "Data: CFP, City of Copenhagen, Resk (2020), Finans Danmark, OpenData DK")+
  #ylab("Quartile")+
  #xlab("")+
    plotTheme

8 Unsupervised Classification Algorithms for Programming and Geographic Clusters

This section takes an experimental approach, using a clustering algorithm to identify geographic “neighborhoods” of nightlife amenities (“geo-clusters”) and thematic groupings of venues (“program clusters”).

The geographic clustering approach is, again, rather inconclusive, so it has been retired.

8.1 Geo-clusters

This first test is a “k-means” cluster analysis that looks strictly for geographic clusters. The analysis partitions a cloud of points in multi-dimensional space into k groups: it finds cluster centroids and assigns each point to a cluster so as to minimize within-cluster variance.
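
As a minimal sketch of the mechanics, the base R example below runs `kmeans()` on toy points (not CFP data) and computes the total within-cluster sum of squares for several values of k, which is the quantity an elbow plot displays. The toy data and object names are illustrative assumptions.

```r
set.seed(123)

# Toy points (not CFP data): two well-separated groups of 20 points each
toy <- rbind(
  cbind(x = rnorm(20, mean = 0,  sd = 0.5), y = rnorm(20, mean = 0,  sd = 0.5)),
  cbind(x = rnorm(20, mean = 10, sd = 0.5), y = rnorm(20, mean = 10, sd = 0.5))
)

# Fit k-means with k = 2; nstart repeats the random initialization
fit <- kmeans(toy, centers = 2, nstart = 10)
fit$centers        # estimated cluster centroids
table(fit$cluster) # number of points assigned to each cluster

# Total within-cluster sum of squares for k = 1..5 (the elbow-plot input)
wss_by_k <- sapply(1:5, function(k) {
  kmeans(toy, centers = k, nstart = 10)$tot.withinss
})
wss_by_k
```

With two clearly separated groups, the within-cluster sum of squares drops sharply from k = 1 to k = 2 and only slightly afterwards. When no such "elbow" appears, the data do not support a clear number of clusters.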

set.seed(123)
library(cluster)
library(factoextra)

# function to compute total within-cluster sum of squares for k clusters
wss <- function(k) {
  kmeans(main_venue_data %>% 
           filter(city == "Copenhagen", 
                  is.na(y) == FALSE) %>% 
           select(x, y), k, nstart = 10 )$tot.withinss
}

# Candidate cluster counts: k = 1 to k = 15
k.values <- 1:15

# Compute wss for each candidate k
wss_values <- map_dbl(k.values, wss)

# Run "elbow plot" to determine optimal cluster number.
plot(k.values, wss_values,
       type="b", pch = 19, frame = FALSE, 
       xlab="Number of clusters K",
       ylab="Total within-clusters sum of squares")

# Create a data object with cluster numbers
test <- kmeans(main_venue_data %>% 
            filter(city == "Copenhagen" & 
                   is.na(y) == FALSE) %>% 
            select(x, y), centers = 4, nstart = 25)
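
An elbow plot can be hard to read when the curve bends gradually. One common cross-check, sketched here on synthetic data rather than the venue coordinates, is the average silhouette width from `cluster::silhouette()`, where higher values indicate better-separated clusters. The helper `avg_sil` and the toy points are assumptions for illustration; this check is not part of the original analysis.

```r
library(cluster)
set.seed(1)

# Synthetic points: three blobs (hypothetical, not CFP data)
toy <- rbind(
  cbind(rnorm(20, 0, 0.3),  rnorm(20, 0, 0.3)),
  cbind(rnorm(20, 5, 0.3),  rnorm(20, 5, 0.3)),
  cbind(rnorm(20, 10, 0.3), rnorm(20, 0, 0.3))
)
d <- dist(toy)

# Average silhouette width for a given k (hypothetical helper)
avg_sil <- function(k) {
  km <- kmeans(toy, centers = k, nstart = 10)
  mean(silhouette(km$cluster, d)[, "sil_width"])
}

k_candidates <- 2:6
sil_values <- sapply(k_candidates, avg_sil)

# k with the highest average silhouette width
best_k <- k_candidates[which.max(sil_values)]
```

On real venue coordinates the peak is usually far less pronounced, which is consistent with the inconclusive geographic result noted above.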

This interactive map shows the location of these clusters:

cluster_data1 <- main_venue_data %>% 
    filter(city == "Copenhagen", 
    is.na(y) == FALSE) %>% 
  cbind(test$cluster) %>% 
  rename(cluster = 'test$cluster')

# Clusters are categorical and k = 4 above, so map one color to each cluster
pal <- colorFactor(c("red", "blue", "green", "purple"), domain = 1:4)

l2 <- leaflet() %>% 
  addProviderTiles(providers$Esri.WorldTopoMap) %>%
  setView(lng = mean(cluster_data1$x, na.rm = TRUE),
          lat = mean(cluster_data1$y, na.rm = TRUE),
          zoom = 11) %>%
  addScaleBar(position = "topleft") %>%
  addCircleMarkers(data= cluster_data1,
                   lng=~x, 
                   lat=~y,
                   radius =~ 1, 
                   fillOpacity =~ 1,
                   color = ~pal(cluster),
                   label=~paste(name, street, " | Cluster: ", cluster))

l2

8.1.1 Geo cluster programming characteristics

main_venue_data %>% 
    filter(city == "Copenhagen", 
    is.na(y) == FALSE) %>% 
           mutate(years_operating = as.numeric(years_operating)) %>% 
  cbind(test$cluster) %>% 
  rename(cluster = 'test$cluster') %>% 
  filter(is.na(years_operating) == FALSE) %>%
  group_by(cluster) %>%
  summarize(mean_size = mean(size),
            mean_promotion = mean(promotion),
            mean_creative_output = mean(creative_output),
            mean_experimentation = mean(experimentation),
            mean_community_focus = mean(community_focus),
            mean_age = mean(years_operating)) %>%
  gather(-cluster, key = "variable", value = "value") %>%
  ggplot()+
  geom_bar(aes(x = cluster, y = value, fill = as.factor(variable)), 
           stat = "identity", position = "dodge")+
   labs(
    title = "Venue Typologies",
     subtitle = "k = 4 geographic clusters",
     caption = "Data: CFP")+
  #ylab("Quartile")+
  plotTheme

8.2 Programming Clusters

There are three programming clusters in the venue sample. They are characterized as follows:

  1. Creative engines - Small to medium venues of median age (~3-10 years) with the most highly rated programming. Located in more outlying areas. (The second-largest cluster)

  2. Mainstream - Large venues of median age with below-average programming rankings. (The smallest cluster)

  3. Copenhagen legacy venues - Larger venues that are, on average, much older and have high programming rankings. Venues in this category tend to be ranked much lower in other cities. (The largest cluster)
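
K-means relies on Euclidean distance, so a variable measured on a wider scale (for example, years of operation versus a 1-4 rating) can dominate the clustering. Whether `main_venue_data` is already normalized is not shown in this section; the sketch below assumes it is not and standardizes hypothetical columns with `scale()` before clustering. All values are invented.

```r
set.seed(123)

# Hypothetical venue metrics on very different scales (not CFP data)
venues <- data.frame(
  size            = sample(1:4, 30, replace = TRUE),
  promotion       = sample(1:4, 30, replace = TRUE),
  years_operating = sample(1:60, 30, replace = TRUE)
)

# Standardize each column to mean 0 and standard deviation 1
venues_scaled <- scale(venues)

# Cluster on the standardized values so no single variable dominates
fit_scaled <- kmeans(venues_scaled, centers = 3, nstart = 25)

# Attach cluster labels back to the original (unscaled) rows
venues$cluster <- fit_scaled$cluster
```

Summaries per cluster can then be reported in the original units while the grouping itself is scale-independent.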

set.seed(123)

library(cluster)
library(factoextra)

# function to compute total within-cluster sum of square 
wss2 <- function(k) {
  kmeans(main_venue_data %>% 
           filter(city == "Copenhagen", 
                  is.na(size) == FALSE,
                  is.na(promotion) == FALSE,
                  is.na(creative_output) == FALSE,
                  is.na(experimentation) == FALSE,
                  is.na(community_focus) == FALSE,
                  is.na(years_operating) == FALSE) %>% 
           mutate(years_operating = as.numeric(years_operating)) %>%
           filter(is.na(years_operating) == FALSE) %>%
           select(size, promotion, creative_output, experimentation, community_focus, years_operating), k, nstart = 10 )$tot.withinss
}

# Candidate cluster counts: k = 1 to k = 15
k.values2 <- 1:15

# Compute wss for each candidate k
wss_values2 <- map_dbl(k.values2, wss2)

# Run "elbow plot" to determine optimal cluster number.
plot(k.values2, wss_values2,
       type="b", pch = 19, frame = FALSE, 
       xlab="Number of clusters K",
       ylab="Total within-clusters sum of squares")

# Create a data object with cluster numbers
test2 <- kmeans(main_venue_data %>% 
           filter(city == "Copenhagen", 
                  is.na(size) == FALSE,
                  is.na(promotion) == FALSE,
                  is.na(creative_output) == FALSE,
                  is.na(experimentation) == FALSE,
                  is.na(community_focus) == FALSE,
                  is.na(years_operating) == FALSE) %>% 
           mutate(years_operating = as.numeric(years_operating)) %>%
           filter(is.na(years_operating) == FALSE) %>%
           select(size, promotion, creative_output, experimentation, community_focus, years_operating), centers = 3, nstart = 25)

main_venue_data %>% 
           filter(city == "Copenhagen", 
                  is.na(size) == FALSE,
                  is.na(promotion) == FALSE,
                  is.na(creative_output) == FALSE,
                  is.na(experimentation) == FALSE,
                  is.na(community_focus) == FALSE,
                  is.na(years_operating) == FALSE) %>% 
           mutate(years_operating = as.numeric(years_operating)) %>%
           filter(is.na(years_operating) == FALSE) %>% 
  cbind(test2$cluster) %>% 
  rename(cluster = 'test2$cluster') %>% 
  group_by(cluster) %>%
  summarize(med_size = median(size),
            med_promotion = median(promotion),
            med_creative_output = median(creative_output),
            med_experimentation = median(experimentation),
            med_community_focus = median(community_focus),
            med_age = median(years_operating)) %>%
  gather(-cluster, key = "variable", value = "value") %>%
  ggplot()+
  geom_bar(aes(x = variable, y = value), stat = "identity")+
  facet_wrap(~cluster)+
   labs(
    title = "Venue Typologies",
     subtitle = "k = 3 clusters based on normalized programming metrics",
     caption = "Data: CFP")+
  #ylab("Quartile")+
  plotTheme

main_venue_data %>% 
           filter(city == "Copenhagen", 
                  is.na(size) == FALSE,
                  is.na(promotion) == FALSE,
                  is.na(creative_output) == FALSE,
                  is.na(experimentation) == FALSE,
                  is.na(community_focus) == FALSE,
                  is.na(years_operating) == FALSE) %>% 
           mutate(years_operating = as.numeric(years_operating)) %>%
           filter(is.na(years_operating) == FALSE) %>% 
  cbind(test2$cluster) %>% 
  rename(cluster = 'test2$cluster') %>% 
  group_by(cluster) %>%
  tally() %>%
  ggplot()+
  geom_bar(aes(x = cluster, y = n), stat = "identity")+
    labs(
    title = "Total Venues per Typology",
     subtitle = "k = 3 clusters based on normalized programming metrics",
     caption = "Data: CFP")+
  plotTheme

cluster_data2 <- main_venue_data %>% 
           filter(city == "Copenhagen", 
                  is.na(size) == FALSE,
                  is.na(promotion) == FALSE,
                  is.na(creative_output) == FALSE,
                  is.na(experimentation) == FALSE,
                  is.na(community_focus) == FALSE,
                  is.na(years_operating) == FALSE) %>% 
           mutate(years_operating = as.numeric(years_operating)) %>%
           filter(is.na(years_operating) == FALSE) %>% 
  cbind(test2$cluster) %>% 
  rename(cluster = 'test2$cluster')

cluster_data2 %>% 
  mutate(cluster = as.factor(cluster)) %>%
          st_as_sf(coords = c("x", "y"), crs = 4326) %>%
          mapView(., zcol = "cluster" )+
  mapview(d2_aggregates %>%
            select(d2_name))
cluster_data2 %>% 
  mutate(cluster = as.factor(cluster)) %>%
          st_as_sf(coords = c("x", "y"), crs = 4326) %>%
  st_join(d2_aggregates %>% st_transform(4326)) %>%
  as.data.frame() %>%
  group_by(d2_name, cluster) %>%
  tally() %>%
  spread(key = cluster, value = n) %>%
  filter(is.na(d2_name) == FALSE) %>%
  rename (creative_engines = `1`,
          mainstream = `2`,
          legacy = `3`)
## # A tibble: 10 × 4
## # Groups:   d2_name [10]
##    d2_name                   creative_engines mainstream legacy
##    <chr>                                <int>      <int>  <int>
##  1 Amager Øst                               2          1      2
##  2 Amager Vest                              1         NA      3
##  3 Bispebjerg                               5          2      1
##  4 Brønshøj-Husum                           1         NA     NA
##  5 Indre By                                10         10     21
##  6 Nørrebro                                 8         NA      9
##  7 Østerbro                                 1          5      2
##  8 Valby                                    1          1      1
##  9 Vanløse                                  2          1     NA
## 10 Vesterbro/Kongens Enghave                5          6      5

9. Key district profiles

Sources: CFP, Rezk (2020), City of Copenhagen: Statbank, Finans Danmark

profiles <- d2_aggregates %>%
  mutate(pop_t2 = as.numeric(pop_t2),
         pop_t1 = as.numeric(pop_t1),
         pop_youth_t1 = as.numeric(pop_youth_t1),
         pop_youth_t2 = as.numeric(pop_youth_t2),
         rent_t1 = as.numeric(rent_t1)/1000,
         rent_t2 = as.numeric(rent_t2)/1000,
         income_t1 = as.numeric(income_t1),
         income_t2 = as.numeric(income_t2)) %>%
  mutate(pop_pct_change = (pop_t2 - pop_t1) / pop_t1 * 100,
         pop_youth_pct_change = (pop_youth_t2 - pop_youth_t1) / pop_youth_t1 * 100,
         pct_change_rent = (rent_t2 - rent_t1) / rent_t1 * 100,
         pct_change_income = (income_t2 - income_t1) / income_t1 * 100,
         venue_per_km2 = venue_count / area_km2,
         # Weighted average of size categories 1-4 (stored under the name med_size)
         med_size = ((4 * size_4) + (3 * size_3) + (2 * size_2) + (1 * size_1)) / venue_count) %>%
  select(d2_name, pop_t1, pop_t2, pop_pct_change, pop_per_km2_t1, pop_per_km2_t2, pop_youth_t1, pop_youth_pct_t1, pop_youth_t2, pop_youth_pct_t2, pop_youth_pct_change, pop_youth_per_km2_t1, pop_youth_per_km2_t2, pop_per_venue_t1, pop_youth_per_venue_t1, pop_per_venue_t2, pop_youth_per_venue_t2, rent_t1, rent_t2, change_rent, pct_change_rent, income_t1, income_t2, change_income, pct_change_income, station_count, transit_stations_per_km2, venue_count, venue_per_km2, med_size, venue_count, mean_promotion, mean_experimentation, mean_creative_output, mean_community_focus) %>%
  st_drop_geometry()

write_csv(profiles, "~/GitHub/CFP/unified__city_data/copenhagen/district_profiles/profiles_table.csv")

profiles %>%
  select(d2_name, pop_t2,rent_t2, income_t2, med_size, transit_stations_per_km2,
         venue_count, mean_promotion, mean_experimentation, 
         mean_creative_output, mean_community_focus ) %>%
  left_join(., cluster_data2 %>% 
  mutate(cluster = as.factor(cluster)) %>%
          st_as_sf(coords = c("x", "y"), crs = 4326) %>%
  st_join(d2_aggregates %>% st_transform(4326)) %>%
  as.data.frame() %>%
  group_by(d2_name, cluster) %>%
  tally() %>%
  spread(key = cluster, value = n) %>%
  filter(is.na(d2_name) == FALSE) %>%
  rename (creative_engines = `1`,
          mainstream = `2`,
          legacy = `3`)) %>%
  mutate(across(where(is.numeric), ~ signif(., digits = 4))) %>%
  kable() %>%
  kable_styling() %>%
    scroll_box(width = "650px", height = "400px")
| d2_name | pop_t2 | rent_t2 | income_t2 | med_size | transit_stations_per_km2 | venue_count | mean_promotion | mean_experimentation | mean_creative_output | mean_community_focus | creative_engines | mainstream | legacy |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Vanløse | 40660 | 36.62 | 382600 | 2.667 | 0.7502 | 3 | 3.000 | 3.000 | 3.000 | 2.333 | 2 | 1 | NA |
| Valby | 65890 | 38.01 | 367800 | 3.667 | 0.8707 | 3 | 2.667 | 2.667 | 3.000 | 3.333 | 1 | 1 | 1 |
| Brønshøj-Husum | 43930 | 31.06 | 349900 | 4.000 | 0.1150 | 1 | 4.000 | 3.000 | 4.000 | 3.000 | 1 | NA | NA |
| Indre By | 56810 | 58.18 | 504600 | 2.786 | 0.9598 | 42 | 3.000 | 2.452 | 3.190 | 2.429 | 10 | 10 | 21 |
| Østerbro | 80660 | 51.79 | 427600 | 3.667 | 0.8228 | 9 | 2.333 | 2.222 | 2.667 | 1.556 | 1 | 5 | 2 |
| Vesterbro/Kongens Enghave | 78890 | 53.36 | 405400 | 2.875 | 1.3380 | 16 | 2.812 | 2.812 | 3.000 | 2.500 | 5 | 6 | 5 |
| Amager Øst | 62100 | 42.16 | 373500 | 3.400 | 0.5378 | 5 | 3.000 | 2.200 | 3.200 | 1.800 | 2 | 1 | 2 |
| Amager Vest | 86350 | 42.16 | 386100 | 3.750 | 0.3114 | 4 | 3.500 | 2.250 | 3.750 | 2.000 | 1 | NA | 3 |
| Bispebjerg | 54240 | 37.73 | 317300 | 2.375 | 0.5886 | 8 | 3.000 | 2.375 | 3.375 | 3.000 | 5 | 2 | 1 |
| Nørrebro | 78510 | 48.65 | 330700 | 2.235 | 0.7341 | 17 | 3.235 | 3.118 | 3.647 | 3.235 | 8 | NA | 9 |

9.1. Indre By

Indre By, the historic city center of Copenhagen, is a vibrant and compact district situated on a small island surrounded by picturesque canals. Dating back to the 12th century, it has evolved from a fishing village into a bustling market town, witnessing significant historical events and architectural transformations. Landmark attractions such as the Rundetaarn, Rosenborg Castle, and the Church of Our Saviour highlight its rich heritage. Indre By boasts the highest number of music venues in the sample, with 42 spaces that cater to various tastes, including “Legacy,” “Creative Engine,” and “Mainstream” venues.

While the district has relatively high content scores for a city center, indicating a dynamic cultural scene, it faces challenges from high housing prices, which limit venues’ growth potential. Excellent public transit access helps Indre By attract both locals and tourists, and the district offers a blend of historical significance and modern vibrancy. This eclectic mix fosters creativity, but economic pressures can hinder the development of new cultural spaces, making it crucial to balance preserving the district’s heritage with nurturing its creative community.

9.2. Nørrebro

Nørrebro is a dynamic and diverse district north of Indre By, known for its multicultural atmosphere and vibrant street life. With a rich history that includes significant working-class roots, Nørrebro has transformed into a hub of creativity and social engagement. The area features a mix of historic architecture and modern developments, with popular landmarks like the Assistens Cemetery, where notable figures such as Hans Christian Andersen are buried, and the lively Nørrebrogade, lined with shops and cafes.

Nørrebro is also home to a thriving cultural scene. It has the second-highest number of music venues in the city, with 17 spaces, yet it has no mainstream venues, highlighting its focus on alternative and independent programming. The district enjoys very high program scores, reflecting its vibrant artistic community. Rents here are not increasing as rapidly as in the city center, growing 17% slower than in Indre By, making it a more affordable option for residents and creative ventures. As gentrification looms, Nørrebro is positioned as an ideal location for affordability controls and preservation efforts. The 2019 Kommuneplan specifically envisions Nordvest as a burgeoning area for creative entrepreneurship.

9.3. Vesterbro/Kongens Enghave

Vesterbro is a vibrant district situated just west of Copenhagen’s city center, known for its eclectic mix of culture and nightlife. It features a blend of residential neighborhoods and bustling streets, making it a popular destination for both locals and tourists. Historically, Vesterbro was a working-class area that underwent significant transformation in the late 20th century, evolving into a hub of creativity and entertainment. Key landmarks include the iconic Tivoli Gardens, a historic amusement park, and Istedgade, renowned for its unique shops and eateries.

With 16 music venues, Vesterbro has above-average venue density, contributing to its lively cultural scene. The district’s program scores are average, indicating a solid array of offerings. Property values are relatively high, reflecting the area’s desirability, while the average income in Vesterbro stands at approximately 405,363 DKK, suggesting a well-off demographic. The district benefits from high transit density, making it easily accessible.

Currently, Vesterbro is undergoing significant transformation, particularly around its industrial areas, such as the former meatpacking district, now reimagined as a creative hub with galleries, restaurants, and event spaces. This shift has fostered new cultural opportunities while maintaining a relative balance between different venue types, enhancing the district’s cultural diversity and appeal.

9.4. Østerbro

Østerbro is an affluent district located to the northeast of Copenhagen’s city center, characterized by its spacious parks, quiet residential streets, and proximity to the waterfront. Historically, Østerbro has evolved from a primarily working-class area into a desirable neighborhood known for its family-friendly environment and green spaces. Notable landmarks include the stunning Fælledparken, the largest park in Copenhagen, and the iconic St. Alban’s Church, a striking example of neo-Gothic architecture.

With 9 music venues, Østerbro has above-average venue density, and its median venue size is notably large, allowing for a variety of events and performances. However, the district has low program scores, owing to its concentration of Mainstream programming. Property values in Østerbro are relatively high, with housing prices averaging around 51,794 DKK per square meter, which is high relative to local residents’ incomes. The district is currently undergoing significant waterfront redevelopment, enhancing its appeal and accessibility. It is reasonable to expect that this development, along with the potential arrival of mainstream venues, will continue unless affordability controls are implemented. The 2019 Kommuneplan highlights Outer Nordhavn as a key area where industrial buildings can be repurposed for creative uses, further contributing to Østerbro’s evolving cultural landscape.

Appendix - Experimental Regression Analysis

How is venue density predicted by transit density (controlling for city)?

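The model explains roughly 20% of the variance in district venue density (adjusted R-squared of 0.1997). Transit station density is a significant positive predictor (estimate 1.40, p < 0.001).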

d1_aggregates %>%
    as.data.frame() %>%
    filter(city_en %in% c("Tokyo", "Berlin", "Montreal", "Stockholm")) %>%
    filter(! d1_name %in% remove_arrondissements) %>%
    mutate(venue_dens = venue_count / area_km2,
           venue_dens = ifelse(venue_dens == 0, 0, venue_dens),
           city  = case_when(city_en == "Berlin" ~ "1. BERLIN, 2017",
                             city_en == "Tokyo" ~ "3. TOKYO, 2019",
                             city_en == "Stockholm" ~ "4. STOCKHOLM, 2021",
                             city_en == "Montreal" ~ "5. MONTREAL, 2022")) %>%
    select(city, venue_dens, transit_stations_per_km2) %>%
    rbind(., d2_aggregates_all %>%
              as.data.frame() %>%
              filter(city_en %in% c("Sydney", "New York", "Rotterdam")) %>%
              mutate(venue_dens = venue_count / area_km2,
                     venue_dens = ifelse(venue_dens == 0, 0, venue_dens),
                     city  = case_when(city_en == "New York" ~ "2. NEW YORK, 2018",
                                       city_en == "Sydney" ~ "6. SYDNEY, 2023",
                                       city_en == "Rotterdam" ~ "7. ROTTERDAM, 2024")) %>%
              select(city, venue_dens, transit_stations_per_km2)) %>%
    rbind(., d2_aggregates %>%
              as.data.frame() %>%
              mutate(venue_dens = venue_count / area_km2,
                     venue_dens = ifelse(venue_dens == 0, 0, venue_dens),
                     city  = "8. COPENHAGEN, 2024") %>%
              select(city, venue_dens, transit_stations_per_km2)) %>% 
  lm(data = ., venue_dens ~ transit_stations_per_km2 + city) %>% 
  summary(.)
## 
## Call:
## lm(formula = venue_dens ~ transit_stations_per_km2 + city, data = .)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.2732 -1.4826 -0.6306  0.0195 27.1161 
## 
## Coefficients:
##                          Estimate Std. Error t value Pr(>|t|)    
## (Intercept)               0.56424    1.16027   0.486  0.62743    
## transit_stations_per_km2  1.40352    0.25056   5.602 9.29e-08 ***
## city2. NEW YORK, 2018    -0.09615    1.24584  -0.077  0.93858    
## city3. TOKYO, 2019       -1.20310    1.45622  -0.826  0.40996    
## city4. STOCKHOLM, 2021   -0.41117    1.94280  -0.212  0.83266    
## city5. MONTREAL, 2022     0.79504    1.63055   0.488  0.62652    
## city6. SYDNEY, 2023       3.81366    1.45100   2.628  0.00943 ** 
## city7. ROTTERDAM, 2024    0.24548    1.77552   0.138  0.89021    
## city8. COPENHAGEN, 2024  -0.16293    1.66990  -0.098  0.92240    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.821 on 157 degrees of freedom
##   (154 observations deleted due to missingness)
## Multiple R-squared:  0.2385, Adjusted R-squared:  0.1997 
## F-statistic: 6.148 on 8 and 157 DF,  p-value: 6.825e-07
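
The two R-squared values in the output are related by the usual adjustment for the number of predictors. As a quick check using only figures from the summary above (157 residual degrees of freedom and 8 estimated slopes), the adjusted value can be reproduced by hand. The variable names are introduced for this example only.

```r
r2          <- 0.2385  # Multiple R-squared from the summary
df_residual <- 157     # residual degrees of freedom
p           <- 8       # slope coefficients (transit density + 7 city dummies)

# Observations used in the fit: residual df + slopes + intercept
n <- df_residual + p + 1

# Adjusted R-squared penalizes R-squared for the number of predictors
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)
round(adj_r2, 4)
```

The result rounds to 0.1997, matching the reported adjusted R-squared, and implies that 166 districts remained in the model after rows with missing values were dropped.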

Appendix - Methods and Sources

Methodology

This section describes the analysis methods used throughout the study.

Districts

Cities have various levels of administrative and cultural districts, from the block to the entire municipal boundary, which impact people’s urban experience and define jurisdictions and decision making powers. Three district levels that are based on statistical census areas as well as legislative areas are used in the analysis:

d_1. The City of Copenhagen forms the broadest level of aggregation to align analysis with decision making bodies.

d_2. 10 districts based on Census Statistical Areas are used to generate insights at the district scale, which also corresponds to nationally recognized census boundaries. This analysis was undertaken at the d_2 level.

d_3. Zip code areas are the smallest spatial unit of research and analysis.

Census

Demographic data were obtained for two time periods (t1, t2). The variables used include:

  1. pop_t1 & pop_t2: population of each district (2017, 2022)
  2. pop_youth_t1 & pop_youth_t2: population of the 20-29 years age group (2017, 2022)
  3. income_t1 & income_t2: Income for persons (14 years+) by district. Personal income in total (excluding imputed rent and before deductions of interest expenses). In DKK. Table KKIND3 (2017, 2022)
  4. rent_t1 & rent_t2 : Property prices on the housing market by district, property category and prices of completed transactions. For condominiums – mean market price, DKK/square meter. Table UDB020. (2017, 2022)

Limitations:

  1. Demographic data at the d1 and d3 levels are not as accessible as in other CFP cities, so only the d2 level is used for quantitative analysis.
  2. Condominiums are the major residential housing type in Copenhagen. Because of limited access to other data sources, condominium prices are used as the measure of property values.

Transportation

Public transit networks provide insights into the varying levels of accessibility that exist within cities. The analysis uses 61 stations, comprising stations on lines M1, M2, M3, and M4 of the Copenhagen Metro and S-train stations.

Venues

Venue data is collected through CFP research and local workshops. 108 venues have been identified and ranked by community members according to a variety of metrics that assess venue features (type, age, social media presence), spatial characteristics (address, location, accessibility, size), and programming (number of events, number of uses, experimentation, creative output).

Data aggregation

Final data aggregation combines districts, census, transportation and venue data into a single data set, which contains the unique demographic, transportation and venue information for each district. Additional variables are calculated to assess trends like venue density and relationships between venues, demographics and transit.
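
As a small sketch of the kind of derived variables described here, the example below computes venue density, residents per venue, and percent population change for a hypothetical district table. The column names mirror those used in the analysis code, but the two rows of values are invented for illustration, and base R `transform()` is used in place of the tidyverse pipeline.

```r
# Hypothetical district-level inputs (values are illustrative only)
districts <- data.frame(
  d2_name     = c("District A", "District B"),
  pop_t1      = c(50000, 80000),
  pop_t2      = c(55000, 84000),
  area_km2    = c(10, 20),
  venue_count = c(20, 8)
)

# Derived indicators used to compare districts
districts <- transform(
  districts,
  venue_per_km2    = venue_count / area_km2,            # venues per square km
  pop_per_venue_t2 = pop_t2 / venue_count,              # residents per venue
  pop_pct_change   = (pop_t2 - pop_t1) / pop_t1 * 100   # % population change
)

districts
```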

Sources

  1. All venue data, which includes the selected venues and their corresponding geographic information, characteristics and rankings, were obtained through CFP research and local workshops throughout 2024.

  2. Geographic district boundaries were obtained from Open Data DK in 2024.

https://www.opendata.dk/city-of-copenhagen/bydele

  3. Transportation data, which include 2024 train station entrance locations, were obtained from the Clustering of Copenhagen Stations research conducted by Anas Rezk. The data were released on May 27, 2020.

https://github.com/rezkanas/CLUSTERING-OF-COPENHAGEN-TRANSPORT-STATIONS/blob/master/stations_venues_P.csv

  4. Demographic data were obtained from the City of Copenhagen: Statbank for the years 2017 and 2022.

https://kk.statistikbank.dk/statbank5a/SelectVarVal/Define.asp?MainTable=KKBEF1&PLanguage=1&PXSId=0&wsid=cflist

References to planning initiatives in Copenhagen relate to the following documents:

City of Copenhagen. (2019). Kommuneplan 2019: Copenhagen’s municipal plan 2019. City of Copenhagen. Retrieved from https://kp19.kk.dk/copenhagen-municipal-plan-2019

Madsen, M. D., Paasch, J. M., & Sørensen, E. M. (2022). The many faces of condominiums and various management structures − The Danish case. Land Use Policy, 120, 106273. https://doi.org/10.1016/j.landusepol.2022.106273